# Inside a Bank (3/3): The System Architecture of a Bank

The third part of the 'Inside a Bank' series descends the elevator to the technical floor: it explains the double-entry ledger, the engines that compose a core banking system, why idempotency is the central problem in money movement, and how resilience and security work when a bug means real financial loss. For developers and architects moving into financial services.

- URL: https://fernando.moretes.com/studies/banco-por-dentro-3-arquitetura-de-sistemas

- Markdown: https://fernando.moretes.com/studies/banco-por-dentro-3-arquitetura-de-sistemas/study.md?lang=en

- Type: Guide / Deep Dive

- Domain: Financial Markets

- Date: 2026-06-21

- Tags: core-banking, ledger, idempotency, event-sourcing, financial-architecture, bacen, hsm, settlement

- Reading time: 8 min

---

If you come from the software world and are entering financial services, you have probably heard that 'banking is different'. But different *how*? Part 1 of this series explained the banking business and regulation (BACEN, SPB). Part 2 covered products and payment rails (PIX, TED, boleto). Now we descend the elevator — in Gregor Hohpe's sense — to the floor where engineers live: the technical core of a bank. Here you will understand why a bank is not just a CRUD with a balance, why idempotency is non-negotiable, and why 'eventual consistency' that is acceptable in your e-commerce is dangerous when the asset is real money.

## What you will learn

- What a double-entry ledger is and why it is immutable by design
- What the 'engines' of a core banking system are and what each one does
- Why idempotency is the central problem in money systems
- How event queues and safe reprocessing fit into this context
- Why strong consistency matters here (and what the Coinbase/AWS MSK incident teaches)
- How HSM and auditability work in practice

## Quick Glossary — Technical Floor Terms

- **Core Banking:** The central system that holds accounts, balances and movements. It is the bank's 'source of truth database'.
- **Ledger:** Accounting record of all movements. Each row is an entry; it is never deleted, only reversed.
- **Double-Entry:** Accounting principle: every debit has a matching credit, so the entries of any movement sum to zero. First described in print by Luca Pacioli in 1494; the practice itself is older.
- **Idempotency:** Property of an operation that can be executed multiple times with the same result. In money: processing the same transfer twice must not debit twice.
- **Settlement:** The moment when money actually changes ownership in the central bank system (BACEN). Before that it is just a promise.
- **Reconciliation:** Process of comparing two independent records to ensure they agree. If they diverge, there is a problem.
- **HSM (Hardware Security Module):** Physical device dedicated to cryptographic operations. Keys never leave the hardware. Mandatory for card operations and transaction signing.
- **Boleto Engine:** Engine that generates and registers bank slips (boletos) in the bank's system and at CIP/BACEN.

## The Double-Entry Ledger: the Git of Money

Think of Git: you never delete a commit, you only add new ones. The history is immutable and auditable. The bank ledger works exactly the same way — and for reasons that go far beyond technical preference.

Double-entry accounting says every money movement has **two sides**: a debit and a credit. When you transfer R$ 100 from your account to someone else's, the system records: debit on your account (balance decreases) and credit on the recipient's account (balance increases). The algebraic sum is always zero. This is not a bank rule — it is a 500-year-old accounting principle that guarantees money neither disappears nor appears from nowhere.
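The two-sided rule can be sketched in a few lines of Python. This is a minimal illustration, not a real core banking API: the names `Entry` and `post_transfer`, the use of integer centavos, and the sign convention (negative for debit, positive for credit) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    account: str
    amount_cents: int  # assumed convention: negative = debit, positive = credit


def post_transfer(ledger: list[Entry], source: str, target: str, amount_cents: int) -> None:
    """Append both legs of a transfer, or nothing at all."""
    if amount_cents <= 0:
        raise ValueError("amount must be positive")
    legs = [Entry(source, -amount_cents), Entry(target, amount_cents)]
    # Double-entry invariant: the legs of one movement sum to zero.
    if sum(leg.amount_cents for leg in legs) != 0:
        raise RuntimeError("unbalanced movement")
    ledger.extend(legs)


ledger: list[Entry] = []
post_transfer(ledger, "alice", "bob", 10_000)  # R$ 100.00
total = sum(e.amount_cents for e in ledger)
```

After any number of transfers, `total` stays at zero; a non-zero total would mean an entry was written without its counterpart.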

From an engineering perspective, this maps directly to **Event Sourcing** (Martin Fowler): the current state (balance) is not stored directly but *derived* from the immutable sequence of events (entries). You can recalculate the balance of any account at any point in time by replaying events up to that date. This has profound consequences: auditing becomes a matter of reading the history, rollback is a reversal entry (a new event, not a deletion), and inconsistencies become visible because the two sides of the ledger no longer sum to zero.
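As a rough sketch of replay and reversal, assuming a hypothetical `Entry` record with signed centavo amounts and showing only one account's side of each movement for brevity:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Entry:
    account: str
    amount_cents: int   # signed: negative = debit, positive = credit (assumed)
    booked_at: datetime


def balance_at(entries: list[Entry], account: str, as_of: datetime) -> int:
    """Replay every entry for `account` booked up to and including `as_of`."""
    return sum(
        e.amount_cents
        for e in entries
        if e.account == account and e.booked_at <= as_of
    )


def reverse(entry: Entry, when: datetime) -> Entry:
    """A rollback is a new entry with the opposite sign; the original stays."""
    return Entry(entry.account, -entry.amount_cents, when)


entries = [
    Entry("alice", 50_000, datetime(2026, 1, 5)),    # deposit of R$ 500.00
    Entry("alice", -10_000, datetime(2026, 1, 10)),  # transfer out of R$ 100.00
]
entries.append(reverse(entries[1], datetime(2026, 1, 12)))  # undo the transfer

jan_11 = balance_at(entries, "alice", datetime(2026, 1, 11))  # before the reversal
jan_31 = balance_at(entries, "alice", datetime(2026, 1, 31))  # after the reversal
```

The history keeps all three entries, so the balance on 11 January can still be reconstructed even though the transfer was later reversed.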

For a dev coming from CRUD: forget `UPDATE accounts SET balance = balance - 100`. In the ledger you do `INSERT INTO entries (account, type, amount, idempotency_key, timestamp) VALUES (...)`. The balance is a view, not a column.
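A small runnable sketch of that idea, using Python's built-in `sqlite3` module as a stand-in for the ledger database. The column names follow the `INSERT` above; the `balances` view, the `bank_cash` counter-account and the sample data are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entries (
        id              INTEGER PRIMARY KEY,
        account         TEXT    NOT NULL,
        type            TEXT    NOT NULL CHECK (type IN ('debit', 'credit')),
        amount          INTEGER NOT NULL CHECK (amount > 0),  -- centavos
        idempotency_key TEXT    NOT NULL,
        timestamp       TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
    );

    -- The balance is derived from entries, never stored.
    CREATE VIEW balances AS
    SELECT account,
           SUM(CASE type WHEN 'credit' THEN amount ELSE -amount END) AS balance
    FROM entries
    GROUP BY account;
    """
)

with conn:  # one transaction: all legs commit together or not at all
    conn.executemany(
        "INSERT INTO entries (account, type, amount, idempotency_key) VALUES (?, ?, ?, ?)",
        [
            ("bank_cash", "debit", 50_000, "deposit-001"),   # counter-leg of the deposit
            ("alice", "credit", 50_000, "deposit-001"),
            ("alice", "debit", 10_000, "transfer-001"),
            ("bob", "credit", 10_000, "transfer-001"),
        ],
    )

balances = dict(conn.execute("SELECT account, balance FROM balances"))
```

Nothing in this schema ever issues an `UPDATE` against a balance; reading `balances` simply aggregates the entries at query time.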

## Core Banking Architecture: the Engines and their Responsibilities

Capability view of a modern digital bank. Each 'engine' is a domain with a single responsibility. The flows listed at the end show the main dependencies and data flow.

### 🏦 Channels

- App / Internet Banking (frontend)
- API Gateway + Auth (mTLS) (edge)

### ⚙️ Business Engines

- Accounts Engine: registration + KYC reference (compute)
- Limits Engine: PIX, TED, card (compute)
- Credit Engine: scoring + granting (compute)
- Boleto Engine: generation + CIP registration (compute)
- Payments Engine: PIX/TED orchestration (compute)

### 📒 Central Ledger

- Ledger Service: double-entry, immutable (compute)
- Ledger DB (append-only, ACID) (data)
- Settlement Engine: SPB / STR / SPI (compute)
- Reconciliation: bilateral comparison (compute)

### 📨 Messaging

- Event Bus (Kafka / MSK) (messaging)
- Dead Letter Queue: safe reprocessing (messaging)

### 🔐 Security

- HSM: keys never leave the hardware (security)
- Audit Log: immutable, WORM (storage)

### 🏛️ Regulators

- BACEN / STR / SPI: final settlement (external)
- CIP: boleto registration (external)

### Flows

- app -> api_gw: HTTPS/mTLS
- api_gw -> payments: request
- api_gw -> accounts: request
- payments -> limits: validate limit
- payments -> ledger_svc: entry
- ledger_svc -> ledger_db: INSERT (append)
- ledger_svc -> event_bus: event published
- event_bus -> settlement: triggers settlement
- event_bus -> reconciliation: triggers reconciliation
- event_bus -> dlq: failure → DLQ
- settlement -> bacen: SPB message
- boleto -> cip: registration
- payments -> hsm: signs transaction
- ledger_svc -> audit_log: immutable log
- credit -> ledger_svc: disbursement

## The Engines: Each Domain has a Single Responsibility

A bank is not a monolith with a balance table. It is a set of **specialized engines**, each with clear responsibility and well-defined boundaries — what a software architect would recognize as Domain-Driven Design applied to decades of regulation.

**Accounts Engine**: maintains customer and account records. Does not touch balances. Think of it as the `users` + `accounts` service in your system, but with a legal obligation for KYC (Know Your Customer — identity verification required by BACEN).

**Limits Engine**: before any transaction is processed, the limit is checked and reserved. It works like a traffic light: if no limit is available, the transaction never reaches the ledger. This prevents negative balance race conditions without needing a lock on the main table.
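One way to picture "check and reserve" is a single conditional statement that does both at once. The sketch below uses `sqlite3` with a hypothetical `limits` table; the table layout and the `reserve_limit` function are assumptions for illustration, not a description of how any particular bank implements its limits engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE limits ("
    " account TEXT PRIMARY KEY,"
    " daily_limit INTEGER NOT NULL,"        # centavos
    " reserved INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO limits (account, daily_limit) VALUES ('alice', 100000)")
conn.commit()


def reserve_limit(conn: sqlite3.Connection, account: str, amount: int) -> bool:
    """Check and reserve in one statement, so two concurrent requests
    cannot both pass the check against the same remaining limit."""
    with conn:
        cur = conn.execute(
            "UPDATE limits SET reserved = reserved + ?"
            " WHERE account = ? AND reserved + ? <= daily_limit",
            (amount, account, amount),
        )
    return cur.rowcount == 1  # no row updated means the limit was not available


first = reserve_limit(conn, "alice", 70_000)   # fits within the 100,000 limit
second = reserve_limit(conn, "alice", 50_000)  # would exceed it, so it is refused
reserved = conn.execute("SELECT reserved FROM limits WHERE account = 'alice'").fetchone()[0]
```

The reservation counter is mutable operational state, which is why an `UPDATE` is acceptable here; the ledger itself remains append-only.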

**Credit Engine**: evaluates score, grants credit and, when approved, instructs the ledger to disburse. It is the only engine that can create money in the customer's account without them having deposited first — hence the importance of its auditability.

**Boleto Engine**: generates the boleto, calculates the barcode, registers at CIP (Interbank Payment Chamber) and monitors payment. When the boleto is paid at another bank, CIP notifies the boleto engine, which instructs the ledger.

**Settlement Engine**: translates the internal intent ('transfer R$ 100 to so-and-so') into messages for BACEN (STR for TED, SPI for PIX). It is the bridge between the bank's internal world and the national financial system.

## Idempotency: the Central Problem in Money Systems

In any distributed system, partial failures happen: the network drops after you sent the request but before you received the response. In your email service, this means resending the email — annoying, but harmless. In money, it means **debiting twice**.

Idempotency is the property that guarantees: no matter how many times you process the same operation, the effect is the same as processing it once. The canonical implementation is simple: each transaction carries a unique `idempotency_key` (generated by the client, usually a UUID). Before processing, the ledger checks: 'have I seen this key before?' If yes, it returns the previous result without reprocessing. If no, it processes and persists the key together with the result.
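The check described above can be sketched with an in-memory stand-in. `TransferService` and its fields are hypothetical names, and a dictionary is used only to keep the example short; it is not durable storage.

```python
import uuid


class TransferService:
    """In-memory stand-in for the ledger's idempotency check (illustrative only)."""

    def __init__(self) -> None:
        self.entries: list[tuple[str, int]] = []  # (account, signed centavos)
        self.results: dict[str, dict] = {}        # idempotency_key -> first result

    def transfer(self, idempotency_key: str, source: str, target: str, amount: int) -> dict:
        previous = self.results.get(idempotency_key)
        if previous is not None:
            return previous  # seen before: same answer, no new entries
        self.entries.append((source, -amount))
        self.entries.append((target, amount))
        result = {"status": "booked", "entries": 2}
        self.results[idempotency_key] = result
        return result


service = TransferService()
key = str(uuid.uuid4())  # generated once by the client and reused on every retry

first = service.transfer(key, "alice", "bob", 10_000)
retry = service.transfer(key, "alice", "bob", 10_000)  # e.g. resent after a timeout
```

The retry receives the same result as the first call, and the ledger still holds exactly two entries.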

This seems trivial until you think about event queues. When you use Kafka (or AWS MSK), the default semantics are **at-least-once delivery**: the message may be delivered more than once, especially during reprocessing after failure. If your consumer is not idempotent, you have a bug that only shows up under pressure — exactly the scenario of the Coinbase/AWS MSK incident in 2021, where a Kafka partition became unavailable and reprocessing caused accounting inconsistencies.

The golden rule: **never trust that a message arrived exactly once**. Design all financial consumers as idempotent. The `idempotency_key` must be persisted in the same database, in the same ACID transaction as the accounting entry — not in a separate cache.
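A minimal sketch of that rule, again with `sqlite3` standing in for the ledger database. The `processed_keys` table, the message shape and the `handle` function are assumptions for the example; the point is only that the key and the entries share one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE processed_keys (idempotency_key TEXT PRIMARY KEY);
    CREATE TABLE entries (
        account         TEXT    NOT NULL,
        amount          INTEGER NOT NULL,  -- signed centavos
        idempotency_key TEXT    NOT NULL
    );
    """
)


def handle(conn: sqlite3.Connection, message: dict) -> bool:
    """Apply a transfer message once; return False for a redelivery."""
    key = message["idempotency_key"]
    try:
        with conn:  # key and entries commit, or roll back, together
            conn.execute("INSERT INTO processed_keys (idempotency_key) VALUES (?)", (key,))
            conn.executemany(
                "INSERT INTO entries (account, amount, idempotency_key) VALUES (?, ?, ?)",
                [
                    (message["source"], -message["amount"], key),
                    (message["target"], message["amount"], key),
                ],
            )
        return True
    except sqlite3.IntegrityError:
        return False  # primary-key violation: this key was already processed


msg = {"idempotency_key": "pix-123", "source": "alice", "target": "bob", "amount": 10_000}
outcomes = [handle(conn, msg) for _ in range(3)]  # simulate at-least-once redelivery
entry_count = conn.execute("SELECT COUNT(*) FROM entries").fetchone()[0]
```

Because the database enforces uniqueness of the key, a crash between "write the entry" and "remember the key" cannot happen: either both are committed or neither is.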

## Consistency: what changes when the asset is money

| Dimension | Typical E-commerce / SaaS | Banking system | Why the difference matters |
| --- | --- | --- | --- |
| Data consistency | Eventual consistency acceptable | Strong consistency (ACID) on ledger | Balance stale for 200 ms can cause double debit |
| Idempotency | Good practice, rarely critical | Mandatory in all consumers | Reprocessing without idempotency = real financial loss |
| Data rollback | DELETE or UPDATE are acceptable | Only reversal (new inverse entry) | Regulatory audit requires complete traceability |
| Availability vs Consistency (CAP) | Prefers availability (AP) | Prefers consistency (CP) in core | Bank unavailable is bad; bank inconsistent is catastrophic |
| Key cryptography | Software KMS usually sufficient | Physical HSM mandatory (PCI-DSS, BACEN) | Transaction keys must never be extractable by software |
| Audit | Logs useful, but not legally mandatory | WORM audit log, minimum 5-year retention (BACEN) | BACEN inspection may require reconstruction of any transaction |

## PIX Settlement Flow with Idempotency and Reconciliation

End-to-end flow of a PIX transfer: from the user's app to settlement at BACEN (SPI) and reconciliation. Highlights idempotency checkpoints and failure handling.

### 📱 Initiation

- User: mobile app (user)
- API Gateway: mTLS + rate limit (edge)

### ⚙️ Processing

- PIX Service: orchestrator (compute)
- Idempotency Store: idempotency_key → result (data)
- Limits Engine: reservation + validation (compute)
- Ledger Service: internal debit + credit (compute)

### 📨 Events

- Event Bus (Kafka / MSK) (messaging)
- DLQ: reprocessing with idempotency (messaging)

### 🏛️ Settlement

- Settlement Engine: formats the SPI message (compute)
- SPI / BACEN: final settlement (external)
- Reconciliation: compares SPI statement × ledger (compute)
- Divergence Alert: finance team (compute)

### 🔐 Security

- HSM: signs the SPI message (security)
- Audit Log: WORM, S3 Object Lock (storage)

### Flows

- user_pix -> gw_pix: POST /pix {idempotency_key}
- gw_pix -> pix_svc: authenticated request
- pix_svc -> idem_store: 1. already processed?
- idem_store -> pix_svc: HIT → return cached
- pix_svc -> limit_check: 2. validate and reserve limit
- pix_svc -> ledger_pix: 3. ACID entry
- ledger_pix -> idem_store: persist key (same tx)
- ledger_pix -> kafka_pix: 4. publish TransactionCreated
- kafka_pix -> settle_svc: consume event
- kafka_pix -> dlq_pix: failure → DLQ
- dlq_pix -> settle_svc: reprocess (idempotent)
- settle_svc -> hsm_pix: 5. sign message
- settle_svc -> spi: 6. send to SPI
- spi -> recon_svc: 7. SPI statement (T+0)
- ledger_pix -> recon_svc: internal statement
- recon_svc -> alert: divergence detected
- ledger_pix -> worm_log: immutable log
- settle_svc -> worm_log: settlement log
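The reconciliation step at the end of this flow (SPI statement versus internal ledger) can be sketched as a comparison of two independent records. The dictionary shape, keyed by transaction id with amounts in centavos, is a simplifying assumption and does not reflect the real SPI statement format.

```python
def reconcile(ledger: dict[str, int], spi_statement: dict[str, int]) -> dict[str, list[str]]:
    """Compare two independent records keyed by transaction id (amounts in centavos)."""
    ledger_ids, spi_ids = set(ledger), set(spi_statement)
    return {
        "missing_at_spi": sorted(ledger_ids - spi_ids),
        "missing_in_ledger": sorted(spi_ids - ledger_ids),
        "amount_mismatch": sorted(
            tx for tx in ledger_ids & spi_ids if ledger[tx] != spi_statement[tx]
        ),
    }


ledger = {"tx-1": 10_000, "tx-2": 2_500, "tx-3": 7_000}
spi_statement = {"tx-1": 10_000, "tx-2": 2_550, "tx-4": 900}

divergences = reconcile(ledger, spi_statement)
needs_alert = any(divergences.values())  # any non-empty bucket triggers the alert
```

Each bucket points to a different kind of problem: a transfer booked internally but never settled, a settlement with no internal entry, or two sides that disagree on the amount.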

## Resilience and Security: when a Bug Costs Real Money

In banking systems, the CAP trade-off (consistency, availability, partition tolerance) carries different weight: when a partition forces the choice, a bank prefers to be **unavailable** rather than **inconsistent**. This is not dogma: it is regulation. BACEN can fine a bank for incorrect balances far more severely than for a maintenance window.

The Coinbase incident with AWS MSK (2021) is a didactic case: a Kafka partition became unavailable, the system automatically reprocessed events, and non-idempotent consumers generated duplicate entries. The problem was not Kafka — it was the absence of idempotency in the consumers. The lesson: **resilient infrastructure does not replace idempotent design**.

For security, the central concept is the **HSM (Hardware Security Module)**. Think of it as a physical safe that performs cryptographic calculations internally: you send the data, it returns the signed result, but the private keys never leave the hardware. This is required by PCI-DSS for card operations and by BACEN for message signing in the SPB. AWS CloudHSM and Azure Dedicated HSM are cloud options that maintain this guarantee.
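The "safe" contract can be illustrated in plain software, with heavy caveats. The class below is a hypothetical stand-in, not an HSM API: it uses an HMAC from Python's standard library where a real deployment would use asymmetric signatures performed inside the device, and Python cannot actually prevent key extraction. It only shows the shape of the interface, where callers can sign and verify but are never handed the key.

```python
import hashlib
import hmac
import secrets


class SoftwareHsmStandIn:
    """Mimics the HSM contract in software: sign and verify, no key export.
    A real HSM enforces this in hardware; Python name mangling does not."""

    def __init__(self) -> None:
        self.__key = secrets.token_bytes(32)  # generated inside, never returned

    def sign(self, message: bytes) -> bytes:
        return hmac.new(self.__key, message, hashlib.sha256).digest()

    def verify(self, message: bytes, signature: bytes) -> bool:
        # Constant-time comparison avoids leaking how many bytes matched.
        return hmac.compare_digest(self.sign(message), signature)


hsm = SoftwareHsmStandIn()
payload = b'{"type":"pix","amount":10000}'  # hypothetical payload, not the SPI format
signature = hsm.sign(payload)

ok = hsm.verify(payload, signature)
tampered = hsm.verify(b'{"type":"pix","amount":99999}', signature)
```

The design point is the boundary: application code holds a handle to the signer, and a change to a single byte of the payload invalidates the signature.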

Auditability is the other side of the coin: every entry, every credit decision, every limit change must be recorded in an **immutable audit log** (WORM: Write Once Read Many). On AWS, S3 Object Lock in compliance mode implements this. BACEN requires minimum 5-year retention for most transactional records.

## Descending the Elevator: from Strategy to Code

Gregor Hohpe describes the architect as someone who moves between the executive floor (strategy, regulation, product) and the technical floor (code, infrastructure, operations) — the 'Architecture Elevator'. This entire series was that journey.

**Part 1** went up to the executive floor: what a bank is, how BACEN regulates, what the SPB is. **Part 2** stopped at the product floor: payment rails, how PIX, TED and boleto work from the user's and business's perspective. This **Part 3** descended to the technical basement: the ledger, the engines, idempotency, HSM.

The insight that connects all three floors: **every business decision has a technical cost, and every technical choice has a business consequence**. BACEN requires 5-year auditability (business) → you need WORM storage (technical). PIX is real-time settlement (business) → you need idempotent consumers and automatic reconciliation (technical). Credit is money creation (business) → the credit engine needs strong consistency and an audit trail (technical).

For a dev or architect entering this domain: the biggest mistake is treating the banking system as 'just another microservice with money'. The constraints are different, the failure consequences are different, and the mental model needs to change before the code does.

## Where to start: mental checklist for the architect

- Learn double-entry before touching code: understand debit, credit and why the sum is always zero
- Treat idempotency as a functional requirement, not an optimization: every financial operation needs an idempotency_key
- Never use UPDATE on balance: use INSERT on ledger and derive balance as an aggregation
- Design event consumers as idempotent: at-least-once delivery is the reality of Kafka
- Strong consistency in the core, eventual consistency only in projections (reports, dashboards)
- Understand what settlement vs authorization means: authorizing is not settling; money only changes hands at BACEN

> **Architect's Perspective: what nobody tells you in the interview:** When I entered financial services coming from high-scale systems in other sectors, the biggest shock was not the technical complexity; it was the mindset shift around failure. In e-commerce, a consistency failure is a bug to fix. In banking, it is a regulatory incident with potential fines, BACEN notification and, in extreme cases, intervention. This changes everything: risk appetite, the deployment process, how you test, the obsession with idempotency.
>
> My recommendation for anyone making this transition: before writing any line of code, spend time with the accounting team and the compliance team. Understand what an accounting entry is, what a reconciliation divergence is, what BACEN requires in terms of traceability. That context will lead you to make completely different, and much better, architecture decisions.
>
> A second piece of advice: do not underestimate legacy. Most banks have core systems with decades of operation. Your job is often not to replace them, but to create an abstraction layer that allows innovation at the edges without touching the core. Hohpe's Architecture Elevator is literal here: you will need to translate between the mainframe's COBOL and the new system's Kafka, and that translation has to be idempotent.

## Verdict: the key takeaway

A bank, seen from the inside, is a system of immutable events with strong consistency guarantees, mandatory idempotency and regulatory auditability — not a CRUD with a balance. The double-entry ledger is not a historical curiosity: it is the central invariant that guarantees money neither disappears nor appears from nowhere, and it maps directly to Event Sourcing. The engines (accounts, limits, credit, boleto, settlement, reconciliation) are domains with single responsibility; understanding their boundaries is the first step to designing or evolving a core banking system. Idempotency is not an optimization — it is the contract that enables safe reprocessing in a world of at-least-once delivery. And resilience here means preferring unavailability to inconsistency, because the regulatory cost of a wrong balance is orders of magnitude greater than the cost of a timeout. If you have absorbed these five principles, you already think like a financial systems architect — the rest is learning the details of each rail and each regulation, which Parts 1 and 2 of this series cover.

## References

- [Gregor Hohpe — The Software Architect Elevator](https://architectelevator.com/book/)
- [Martin Fowler — Event Sourcing Pattern](https://martinfowler.com/eaaDev/EventSourcing.html)
- [BACEN — Estabilidade Financeira e Regulação](https://www.bcb.gov.br/estabilidadefinanceira)
- [AWS — Building a Modern Core Banking Platform](https://aws.amazon.com/financial-services/banking/)
- [AWS CloudHSM — Hardware Security Module](https://aws.amazon.com/cloudhsm/)
- [Amazon S3 Object Lock — WORM Storage](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lock.html)
- [Martin Fowler — Accounting Patterns](https://martinfowler.com/ap2/index.html)

