Onur Altıntaşlı — Data Management Leader

Most ETL engineers in Turkish financial services come up through banking. They learn to move transactions, reconcile end-of-day positions, build CDC pipelines off Oracle GoldenGate, maybe a Kafka stream for card authorizations. Then they get assigned to a pension project (BES, OKS, or the HAYMER submission pipeline itself) and within two months the assumptions they built their careers on start failing in subtle, expensive ways.

HAYMER (Hayat ve Emeklilik Veri Yönetim Merkezi) is not just another regulatory reporting endpoint. The submission structure forces a class of pipeline design problems that simply do not exist in transactional banking ETL. After enough years building both sides, I can say this without hedging: if you treat HAYMER like a slightly stricter BDDK report, you will ship a pipeline that is technically green and operationally wrong.

Here is what actually makes it different.

The participant is not a row, it is a timeline

In banking ETL, an account is a relatively stable entity. It has an open date, a close date, maybe a status change or two, and the interesting data lives in the transactions hanging off it. You can model it as a slowly changing dimension and move on.

A pension participant is the opposite. The participant is the state machine. Across a single contract you track:

Entry date into the system (not the same as contract start)
State transitions: active, paid-up (ödemeye ara), transferred-in, transferred-out, vested, retired, deceased, beneficiary-claimed
Contribution suspension and resumption windows
Employer changes within OKS (auto-enrollment) without contract termination
Fund allocation changes that must be reconstructible at any historical date
Vesting (hak kazanma) accrual that depends on continuous time-in-system, not calendar time

A Type 2 SCD with valid_from / valid_to is not enough. You need bitemporal modeling — system time and business effective time — because HAYMER will ask you, three years from now, what the participant's state was on a specific date, as you knew it on that date, not as you know it today. Retroactive corrections happen constantly, and they cannot overwrite history.

If your pipeline cannot answer "what did we believe to be true on 14 March 2022, and what is true now about 14 March 2022" as two distinct queries, you are going to fail an audit.
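To make the two queries concrete, here is a minimal sketch in Python. It assumes a simplified append-only fact table in which each row says "from this effective date, the state is X" and carries the date we recorded it. The `StateFact` schema and the `state_as_of` helper are hypothetical names for illustration, not a HAYMER structure, and a production model would also need closed intervals and explicit retractions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class StateFact:
    """One append-only assertion about a participant's state (hypothetical schema)."""
    participant_id: str
    state: str
    effective_from: date  # business time: when the state starts to apply
    recorded_on: date     # system time: when we wrote the fact down


def state_as_of(facts: Iterable[StateFact], participant_id: str,
                business_date: date, known_on: date) -> Optional[str]:
    """State on business_date, using only facts recorded on or before known_on."""
    visible = [
        f for f in facts
        if f.participant_id == participant_id
        and f.recorded_on <= known_on
        and f.effective_from <= business_date
    ]
    if not visible:
        return None
    # Latest effective date wins; for the same effective date, the latest recording wins.
    return max(visible, key=lambda f: (f.effective_from, f.recorded_on)).state


facts = [
    StateFact("P1", "active", date(2020, 1, 15), date(2020, 1, 15)),
    # Retroactive correction recorded in 2023: contributions were suspended from 1 March 2022.
    StateFact("P1", "suspended", date(2022, 3, 1), date(2023, 6, 1)),
]

target = date(2022, 3, 14)
believed_then = state_as_of(facts, "P1", target, known_on=target)
known_now = state_as_of(facts, "P1", target, known_on=date(2024, 1, 1))
print(believed_then, known_now)  # active suspended
```

The correction is a new row, not an update, which is why both answers remain available from the same table.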

Temporal precision is not a nice-to-have

In a card authorization pipeline, if a transaction timestamp drifts by 200 milliseconds, nobody dies. In HAYMER, the date a contribution is credited, the date it is invested, and the date whose unit price applies are three separate values, and confusing them changes the participant's unit count.

Concrete example: a contribution arrives at the pension company on a Friday after the fund cutoff. It is credited Friday but invested at Monday's unit price. If your ETL writes the investment date as Friday because that is when the cash hit, the participant's units are calculated against the wrong NAV. Over thousands of participants and years of compounding, the gap is real money. And HAYMER reconciliation will eventually surface it — usually during a regulator review, never when convenient.
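The Friday example can be sketched as follows. Everything specific here is an assumption for illustration: the 13:00 cutoff, the Monday-to-Friday calendar with no public holidays, the rule that the price date equals the investment date, and the unit prices themselves. Real cutoffs and pricing rules come from the fund's own terms.

```python
from dataclasses import dataclass
from datetime import date, datetime, time, timedelta
from decimal import Decimal

FUND_CUTOFF = time(13, 0)  # hypothetical cutoff, not a regulatory value


def next_business_day(d: date) -> date:
    """Next Monday-to-Friday date; public holidays are ignored in this sketch."""
    d += timedelta(days=1)
    while d.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        d += timedelta(days=1)
    return d


@dataclass(frozen=True)
class ContributionDates:
    credited_on: date  # when the cash reached the pension company
    invested_on: date  # when it is actually placed in the fund
    price_date: date   # which day's unit price applies


def resolve_dates(received_at: datetime) -> ContributionDates:
    credited = received_at.date()
    on_time = credited.weekday() < 5 and received_at.time() <= FUND_CUTOFF
    invested = credited if on_time else next_business_day(credited)
    # Assumption: the unit price of the investment date applies.
    return ContributionDates(credited, invested, invested)


def units_for(amount: Decimal, price_date: date, unit_prices: dict) -> Decimal:
    return (amount / unit_prices[price_date]).quantize(Decimal("0.000001"))


# Made-up unit prices for a Friday and the following Monday.
unit_prices = {date(2024, 3, 15): Decimal("0.050000"),
               date(2024, 3, 18): Decimal("0.050500")}

dates = resolve_dates(datetime(2024, 3, 15, 16, 30))  # Friday, after cutoff
correct_units = units_for(Decimal("1000"), dates.price_date, unit_prices)
wrong_units = units_for(Decimal("1000"), dates.credited_on, unit_prices)
print(dates.credited_on, dates.invested_on, wrong_units - correct_units)
```

Keeping all three dates on the record, instead of a single "transaction date", is what lets you recompute units later and show which price was used.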

This is why the standard "truncate and reload daily" pattern that works fine for management reporting is malpractice here. Every value needs a provenance: which event produced it, at what business time, with what unit price reference, against which fund definition version.

Reference data is not reference data

In a typical ETL you treat fund codes, fee schedules, and tax tables as reference data. You join against them. They change occasionally and you update the dimension.

In pension pipelines, these are contractual artifacts. The fee schedule that applied to a participant who entered in 2018 is not the same as the one that applies to a 2024 entrant, even on the same product. When fees are recalculated retroactively because of a system error, you need to know which version of which schedule was in force for which participant on which date — and apply corrections without disturbing the participants who were correctly billed.

This means reference tables themselves need bitemporal versioning, and your join logic needs to resolve the correct version per participant per event. A standard star schema join will silently use the latest version and give you numbers that look right and are wrong.
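A rough sketch of per-participant, per-event version resolution, in Python. The `FeeScheduleVersion` fields, the product code, and the rates are all invented for illustration; the point is only that the lookup takes the participant's entry date, the event date, and the "as known on" date as inputs, rather than joining to whatever row is current.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Iterable


@dataclass(frozen=True)
class FeeScheduleVersion:
    """Hypothetical versioned fee schedule row."""
    product: str
    entrants_from: date   # applies to participants who entered on or after this date
    effective_from: date  # business time from which the rate applies
    recorded_on: date     # system time the version was written
    annual_rate: Decimal


def fee_rate_for(versions: Iterable[FeeScheduleVersion], product: str,
                 entry_date: date, event_date: date, known_on: date) -> Decimal:
    candidates = [
        v for v in versions
        if v.product == product
        and v.entrants_from <= entry_date
        and v.effective_from <= event_date
        and v.recorded_on <= known_on
    ]
    if not candidates:
        raise LookupError(f"no fee schedule for {product} at {event_date}")
    # Closest entrant cohort first, then latest effective date, then latest recording.
    best = max(candidates,
               key=lambda v: (v.entrants_from, v.effective_from, v.recorded_on))
    return best.annual_rate


versions = [
    FeeScheduleVersion("BES-A", date(2018, 1, 1), date(2018, 1, 1),
                       date(2018, 1, 1), Decimal("0.0200")),
    FeeScheduleVersion("BES-A", date(2024, 1, 1), date(2024, 1, 1),
                       date(2023, 12, 1), Decimal("0.0150")),
]

event = date(2024, 6, 1)
today = date(2024, 6, 2)
rate_2018_entrant = fee_rate_for(versions, "BES-A", date(2018, 5, 10), event, today)
rate_2024_entrant = fee_rate_for(versions, "BES-A", date(2024, 2, 1), event, today)
print(rate_2018_entrant, rate_2024_entrant)  # 0.0200 0.0150
```

A join that picks the latest version would return 0.0150 for both participants, which is exactly the plausible-looking wrong number described above.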

Late-arriving data is the rule, not the edge case

Banking ETL treats late-arriving data as an exception path. You handle it, but most of your volume is on-time.

In pension data, late arrivals are structural:

Employer contribution files for OKS arrive weeks after the payroll period they cover
Transfer-in (aktarım) data from another pension company can lag 30-60 days and brings historical state with it
Death notifications come from MERNIS asynchronously and require unwinding contributions made after the date of death
Court orders and beneficiary disputes inject corrections months or years later

A pipeline designed around "yesterday's data, processed tonight" cannot absorb this. You need event sourcing or at minimum an append-only contribution ledger where corrections are new events, not updates. Mutability anywhere in this chain will eventually corrupt a participant balance.
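As a minimal illustration of "corrections are new events", here is an in-memory append-only ledger in Python. The class and field names are hypothetical, and a real ledger would live in a database with the same rule enforced by constraints. The scenario mirrors the death-notification case: a contribution is unwound by appending a reversing event, and the original row is never touched.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import List, Optional


@dataclass(frozen=True)
class LedgerEvent:
    event_id: str
    participant_id: str
    amount: Decimal        # signed; reversals carry the negated amount
    effective_on: date     # business date the amount applies to
    recorded_on: date      # date the event was appended
    reverses: Optional[str] = None


class ContributionLedger:
    """Append-only: there is deliberately no update or delete method."""

    def __init__(self) -> None:
        self._events: List[LedgerEvent] = []

    def append(self, event: LedgerEvent) -> None:
        if any(e.event_id == event.event_id for e in self._events):
            raise ValueError(f"duplicate event id {event.event_id}")
        self._events.append(event)

    def reverse(self, event_id: str, new_event_id: str, recorded_on: date) -> None:
        original = next((e for e in self._events if e.event_id == event_id), None)
        if original is None:
            raise KeyError(event_id)
        self.append(LedgerEvent(new_event_id, original.participant_id,
                                -original.amount, original.effective_on,
                                recorded_on, reverses=event_id))

    def balance(self, participant_id: str, effective_on: date, known_on: date) -> Decimal:
        return sum((e.amount for e in self._events
                    if e.participant_id == participant_id
                    and e.effective_on <= effective_on
                    and e.recorded_on <= known_on), Decimal("0"))


ledger = ContributionLedger()
ledger.append(LedgerEvent("C1", "P1", Decimal("1000"), date(2024, 1, 10), date(2024, 1, 10)))
ledger.append(LedgerEvent("C2", "P1", Decimal("500"), date(2024, 2, 10), date(2024, 2, 10)))
# A notification arrives on 20 March with a date of death of 1 February: unwind C2.
ledger.reverse("C2", "R1", recorded_on=date(2024, 3, 20))

before = ledger.balance("P1", date(2024, 2, 29), known_on=date(2024, 3, 1))
after = ledger.balance("P1", date(2024, 2, 29), known_on=date(2024, 3, 31))
print(before, after)  # 1500 1000
```

Both balances stay reproducible: the one you reported before the notification and the one you report after it.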

Reconciliation is N-way, not 2-way

In banking you reconcile your books to the general ledger and maybe to the clearing house. Two-way reconciliation, mostly automated.

HAYMER pipelines reconcile across at minimum:

Internal contribution ledger
Custodian (saklayıcı kuruluş) fund unit records
HAYMER-submitted state
Tax authority records for state contribution (devlet katkısı) eligibility
Employer-reported amounts for OKS

When these disagree — and they will — the question is not "which one is right." The question is "which one is right for this specific value at this specific time," because each system has its own latency, its own corrections, and its own version of truth. Your pipeline needs to materialize the discrepancies as first-class data, not log them as errors.
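One way to treat discrepancies as data is to emit a record per participant whenever the sources do not all agree, including when a source has no value at all. The sketch below is in Python; the source names, the `Discrepancy` shape, and the amounts are assumptions for illustration, and a real pipeline would persist these rows and track their resolution.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Discrepancy:
    participant_id: str
    measure: str
    as_of: date
    # One (source name, value) pair per source; None means the source had no record.
    values: Tuple[Tuple[str, Optional[Decimal]], ...]


def reconcile(measure: str, as_of: date,
              sources: Dict[str, Dict[str, Decimal]],
              tolerance: Decimal = Decimal("0")) -> List[Discrepancy]:
    participants = sorted(set().union(*(s.keys() for s in sources.values())))
    found = []
    for pid in participants:
        observed = {name: values.get(pid) for name, values in sources.items()}
        present = [v for v in observed.values() if v is not None]
        missing = len(present) < len(observed)
        if missing or max(present) - min(present) > tolerance:
            found.append(Discrepancy(pid, measure, as_of,
                                     tuple(sorted(observed.items()))))
    return found


sources = {
    "internal_ledger": {"P1": Decimal("1000"), "P2": Decimal("500")},
    "custodian": {"P1": Decimal("1000"), "P2": Decimal("480")},
    "haymer_submitted": {"P1": Decimal("1000")},  # P2 not yet submitted
}

discrepancies = reconcile("contribution_total", date(2024, 3, 31), sources)
for d in discrepancies:
    print(d.participant_id, dict(d.values))
```

Because each record keeps every source's value side by side, the later question of which source was right for that value at that time can be answered from the record itself.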

What this means for design

If you are starting or rebuilding a HAYMER pipeline, these are the assumptions to discard in the first week:

Forget last-write-wins. Everything is an event with effective time and recorded time.
Forget mutable participant state. State is a projection of the event log, recomputable for any historical date.
Forget treating reference data as static lookups. Version everything that touches a calculation.
Forget batch-only thinking. The arrival pattern is bursty, late, and bidirectional with corrections flowing back.
Forget single-source-of-truth rhetoric. Build for reconciliation across competing truths.

The engineers who do well on these projects are usually not the ones with the most ETL tooling experience. They are the ones who can think in terms of accounting ledgers, audit trails, and state machines — and who internalize that the row they are processing represents thirty years of someone's working life turned into a retirement income. That framing changes the design choices you make at 2am when a deadline is closing in.

Get it wrong on a card transaction, you reverse the entry. Get it wrong on a pension contribution from 2019, you are explaining to a 64-year-old why their monthly income is lower than the statement they have been reading for six years.

That is the difference. Build accordingly.