2026-05-02

dbt, Spark, or PL/SQL: Why the Tool Choice Is the Wrong Conversation

The data engineering tooling conversation follows a reliable cycle. A new tool emerges that solves a real problem in a genuinely better way. Early adopters succeed. Case studies are written. Conference talks are given. The tool becomes the consensus choice. Organizations adopt it — some because it solves their problem, many because it's what the job postings require and what the consultants recommend.

Then the next cycle starts.

I've watched this happen with data warehousing platforms, ETL tools, orchestration frameworks, and now the modern data stack. Each time, the organizations that fare best are not the ones that adopt earliest or latest. They're the ones that stay focused on the problem rather than the tool.

What Each Tool Is Actually Good At

Let me be specific, because the debate is often conducted at too high a level of abstraction.

PL/SQL in Oracle environments is mature, deeply integrated with the database engine, and extremely well-suited to transactional financial data processing. In regulated financial services in Turkey, where Oracle is the dominant database platform, PL/SQL pipelines benefit from decades of optimization, a deep talent pool, and direct integration with the database features that matter for compliance: partitioning, row-level security, audit triggers, and fine-grained access control. The argument against PL/SQL is rarely technical. It's usually that the language is "old" and that hiring for it is hard. Both objections are manageable with the right approach.

dbt solves a specific problem extremely well: transforming data that's already in a modern cloud data warehouse using version-controlled, testable SQL. The documentation, lineage, and testing capabilities are genuinely superior to what most teams build by hand. dbt falls short in environments where the data isn't in a Snowflake/BigQuery/Redshift-shaped warehouse, where transformations require procedural logic that SQL can't express cleanly, or where the operational overhead of the modern data stack isn't justified by the scale or complexity of the problem.

Spark addresses a genuine problem: data volumes that exceed what a single-node database can handle, or processing patterns that benefit from distributed computation. In most Turkish financial services environments, that problem doesn't exist at a scale that justifies Spark's complexity. I've seen Spark adopted in environments where a well-tuned Oracle query would have been faster, simpler, and cheaper to operate. The cost was two years of added complexity for no meaningful performance gain.

The Question Nobody Asks

Before evaluating tools, ask the question that should come first: what is the actual constraint?

If the constraint is data volume, tools that scale horizontally matter. If the constraint is transformation complexity, tools with expressive logic matter. If the constraint is auditability and lineage, tools with native documentation matter. If the constraint is operational cost, tools your team already knows matter most.
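As a rough illustration, the mapping above can be written down as a lookup that forces the constraint to be named before any tool is discussed. This is only a sketch: the dictionary keys and the `priority_for` helper are hypothetical names invented for this example, not part of any tool.

```python
# Hypothetical encoding of the constraint-first rule described above.
# Keys and values paraphrase the paragraph; none of this is a real API.
CONSTRAINT_TO_PRIORITY = {
    "data_volume": "tools that scale horizontally",
    "transformation_complexity": "tools with expressive logic",
    "auditability_and_lineage": "tools with native documentation",
    "operational_cost": "tools your team already knows",
}


def priority_for(constraint: str) -> str:
    """Return what to prioritize for a named constraint, or fail loudly."""
    try:
        return CONSTRAINT_TO_PRIORITY[constraint]
    except KeyError:
        # An unnamed or vague constraint is the real problem, so refuse to guess.
        raise ValueError(
            f"Unknown constraint {constraint!r}: name the constraint before naming a tool"
        ) from None


print(priority_for("operational_cost"))
```

The point of the failure branch is the same as the paragraph's: if the constraint can't be named, the tool evaluation has no basis yet.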

In regulated financial services, the constraints I encounter most frequently are auditability, change management under regulatory pressure, and organizational knowledge concentration. None of these are primarily tool problems. They're architecture and process problems that exist regardless of what tool you're using.

The Migration Cost Nobody Budgets

Every tool migration has a hidden cost that rarely appears in the business case: the loss of embedded business logic.

A PL/SQL package that's been running production compliance reporting for eight years contains business rules. Some are documented. Many aren't — they're encoded in the logic itself, understood implicitly by the team members who've maintained it. When you migrate to dbt, you don't just port the SQL. You have to understand and re-implement every business rule, including the ones nobody wrote down.

That work is expensive. It requires the people who know the current system to spend months on the migration instead of on new work. It also introduces re-implementation risk: a new system that passes the same tests as the old one but handles an edge case differently.
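One way to surface that risk is to run the old and new logic over the same inputs and compare the results row by row, rather than relying on the existing test suite alone. The sketch below assumes both implementations can be called as plain functions over records. `legacy_fee`, `migrated_fee`, and the fee-waiver rule are invented purely for illustration and do not describe any real system.

```python
from decimal import Decimal
from typing import Callable, Iterable

Row = dict
Rule = Callable[[Row], Decimal]


def legacy_fee(row: Row) -> Decimal:
    """Hypothetical legacy rule: 1% fee, with an undocumented waiver under 10."""
    amount = Decimal(row["amount"])
    if amount < Decimal("10"):
        return Decimal("0.00")
    return (amount * Decimal("0.01")).quantize(Decimal("0.01"))


def migrated_fee(row: Row) -> Decimal:
    """Hypothetical re-implementation that missed the unwritten waiver."""
    return (Decimal(row["amount"]) * Decimal("0.01")).quantize(Decimal("0.01"))


def find_mismatches(rows: Iterable[Row], old: Rule, new: Rule) -> list:
    """Return (row, old_result, new_result) for every row where the two disagree."""
    mismatches = []
    for row in rows:
        expected, actual = old(row), new(row)
        if expected != actual:
            mismatches.append((row, expected, actual))
    return mismatches


# A test suite built only from "typical" rows would pass on row 1 and never see row 2.
sample = [{"id": 1, "amount": "250.00"}, {"id": 2, "amount": "5.00"}]
mismatches = find_mismatches(sample, legacy_fee, migrated_fee)
for row, expected, actual in mismatches:
    print(f"row {row['id']}: legacy={expected} migrated={actual}")
```

A comparison like this is only as good as the inputs fed to it, so it works best against a broad slice of real production data rather than hand-picked cases.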

I've seen organizations budget the infrastructure cost of a migration and ignore the knowledge transfer cost. The result is migrations that take three times as long as planned and produce systems that are technically modern and behaviorally uncertain.

What Actually Matters

The question to answer before any tool decision: does my current tooling prevent me from doing something I need to do? Not: is there a newer tool that other organizations are using?

If the answer is yes — the current tooling is actually blocking work — evaluate alternatives. Evaluate them against your specific constraints, not against theoretical scale or industry benchmarks that don't reflect your environment.

If the answer is no — the current tooling works, is understood, and is maintainable — the case for migration needs to be based on a specific problem it solves, not on trend adoption.


The best data tooling decision I've made wasn't about choosing the most modern stack. It was about knowing which tool was the right one for the problem at hand, regardless of what was popular that year. That judgment is rarer than technical skill, and worth more.