Who is Fernando F. Azevedo?

Fernando F. Azevedo is a Senior Solutions Architect at Banco Itaú with 16+ years of experience across AWS, event-driven architecture, DevSecOps, Data Mesh, AI and financial systems.

What technical topics does Fernando work with?

Fernando works with AWS, Kubernetes, Kafka, Data Mesh, Amazon Bedrock, RAG, DevSecOps, observability, financial systems and architecture communication using C4, ADRs and trade-off analysis.

Is Fernando available for professional conversations?

Fernando is currently building at Banco Itaú and is open to thoughtful conversations about architecture, cloud, AI, engineering leadership, community, podcasts and technical collaboration.

Decision (ADR)OLTP em alto crescimento (cenário)DadosAccepted

ADR: Aurora Sharding — App-Level vs Aurora Limitless vs Citus

Feb 18, 2026 9 min AI-assisted

Listen to study

generated on play

Generated only on first play

On demand

0:000:00

Speed

The MP3 is saved to S3 after the first play.

A high-growth OLTP workload exhausted the capacity of a single Aurora PostgreSQL writer. This ADR evaluates three sharding strategies — application-layer sharding, Aurora Limitless Database, and managed Citus/PostgreSQL — weighing operational complexity, cost, cross-shard query support, and migration risk.

When a single Aurora PostgreSQL writer is no longer sufficient, the decision of how to distribute load is not trivial. Each sharding approach carries a different contract with the engineering team: full control at high cognitive cost, a managed service with model constraints, or a database extension with portability trade-offs. This ADR documents the reasoning that would lead me to choose Aurora Limitless as the primary path — and when I would change my mind.

Scenario Context

System: High-growth OLTP platform (composite scenario, not a specific company)
Current engine: Aurora PostgreSQL 15, single writer, multiple read replicas
Problem scale: Writer saturated: CPU > 85% sustained, growing WAL lag, P99 write latency > 400ms
Estimated volume: ~50k transactions/second peak, main dataset ~4TB, 15% monthly growth
Access pattern: 80% of writes concentrated on main entity with natural shard key (tenant_id / account_id)
Constraints: 6-engineer data team; 99.99% SLA; maintenance window < 30 min/month
Relevant stack: AWS (us-east-1), Aurora PostgreSQL, Lambda, ECS, Terraform, Datadog
ADR status: Accepted — Aurora Limitless as primary decision

The Real Problem: Why Sharding Is the Last Option, Not the First

Before any conversation about sharding, it's mandatory to exhaust vertical and read-scaling alternatives. In this scenario, we've already done that: the writer is on db.r7g.16xlarge, read replicas absorb 100% of read queries via PgBouncer with session pooling, and indexes have been reviewed by a senior DBA. The problem is structural — write volume doesn't fit in a single node, and Aurora doesn't offer native horizontal write scaling in the classic model.

The diagnosis matters here. The bottleneck is not storage I/O — Aurora decouples storage from compute and the distributed storage layer scales automatically. The bottleneck is the WAL writer process and the lock manager on the compute node. Every committed transaction must pass through the writer, serialize the WAL, and wait for the storage node quorum ACK. With 50k TPS peak and average transactions of 5-8 statements, we're talking about hundreds of thousands of lock operations per second. This is a compute limit, not a storage limit.

The second dimension of the problem is growth. With 15% monthly growth, even if a vertical upgrade bought 6 months, we'd be back at the same point in under a year — with an even larger, more expensive instance and fewer escape options. Sharding isn't just about the present; it's about creating an architecture that scales with the business without requiring a new platform decision every growth cycle.

The third dimension is the team. Six data engineers with a 99.99% SLA means every unplanned oncall hour is expensive. The sharding choice cannot create an operational surface the team can't sustain. This immediately eliminates any solution that requires the team to implement their own rebalancing, shard metadata management, or cross-shard query routing from scratch.

The Forces at Play

Any sharding decision involves tensions that can't be resolved — only managed. I'll be explicit about each one:

Shard key and data distribution. We have tenant_id as a natural key. This is a huge advantage: most queries are scoped by tenant, meaning cross-shard queries are the exception, not the rule. But tenants aren't uniform — some have 100x more data than others. Any sharding strategy needs to handle hot shards from large tenants, either via sub-sharding or special routing.

Cross-shard queries. Even with 80% of queries scoped by tenant, the remaining 20% exist: aggregate reports, admin queries, cross-tenant joins in specific features. The question isn't whether cross-shard queries will happen, but what the cost of executing them is and who pays that cost — the application, the database, or the engineer who needs to rewrite the query.

Rebalancing. Shards aren't static. Tenants grow, new tenants arrive, tenants leave. The ability to move data between shards without downtime is critical. Solutions that require the team to implement this manually are a serious operational risk for a 6-person team.

PostgreSQL compatibility. The system uses PostgreSQL-specific features: JSONB, pg_trgm for partial full-text search, and some legacy stored procedures. Any solution needs to preserve this compatibility or the application migration cost explodes.

Cost. Aurora Limitless has a different pricing model than classic Aurora — you pay for Distributed Processing Units (DPUs) in addition to storage. For workloads that could be served by 2-3 independent writers with app-level sharding, Limitless cost may be 30-50% higher (estimate based on public AWS benchmarks, not direct measurement in this scenario). That delta needs to be justified by reduced operational complexity.

Vendor lock-in. Aurora Limitless is AWS-specific. Citus is open-source but has a managed version on Azure (Cosmos DB for PostgreSQL). App-level sharding with standard PostgreSQL is the most portable. For a system with 99.99% SLA and critical business dependency, portability has real value — but it's not free.

Evaluated Options

Option A: Application-Layer Sharding

Pros

Full control over routing logic and rebalancing
Maximum portability — any PostgreSQL works as a shard
Potentially lower infra cost (independent Aurora writers)
No dependency on experimental features or specific managed services

Cons

High implementation complexity: shard registry, connection routing, distributed transactions
Cross-shard queries must be implemented and maintained in the application (manual scatter-gather)
Shard rebalancing is the team's responsibility — serious operational risk
Schema migrations must be coordinated across all shards simultaneously
Reference: Figma and Notion took 12-18 months to stabilize their implementations

Viable for large teams with distributed systems expertise. Inadequate for a 6-engineer team with critical SLA.

Option B: Aurora Limitless Database

Pros

Transparent sharding managed by AWS — no shard registry in the application
Automatic shard rebalancing without downtime
Standard PostgreSQL interface — compatibility with existing drivers via Transaction Router
Horizontal write scaling without application architecture changes
Native integration with AWS ecosystem (IAM, CloudWatch, Parameter Groups)

Cons

Vendor lock-in: AWS-exclusive feature, no direct migration path to another provider
Additional DPU cost — may be 30-50% more expensive than manual sharding at moderate scale (estimate)
Documented limitations: some DDLs and PostgreSQL features not supported in distributed model
Relatively new feature (GA in 2024) — less battle-tested production history
Shard key must be defined at table creation — changing it later is complex

Best trade-off for this scenario: drastically reduces operational burden while maintaining PostgreSQL compatibility and write scalability.

Option C: Citus / Managed PostgreSQL (Azure or self-managed)

Pros

Open-source with mature, well-documented extension
Native support for distributed queries including cross-shard aggregations and joins
Greater portability than Limitless — can run on any cloud or on-prem
Shard rebalancing natively supported by the extension

Cons

Self-managed Citus: high operational burden — upgrades, HA, backups, rebalancing are team's responsibility
Managed Citus (Azure Cosmos DB for PostgreSQL): cloud provider change — high infra and operational migration cost
Coordinator/worker model introduces single point of failure in coordinator (mitigable with HA, but adds complexity)
Compatibility with some advanced PostgreSQL features may vary by version

Excellent technical option if the system is not heavily coupled to AWS or if portability is a strategic priority. Not the case here.

Architecture Decision

Accepted

Context

OLTP workload with saturated Aurora PostgreSQL writer, 6-engineer data team, 99.99% SLA, natural shard key available (tenant_id), and stack heavily integrated with AWS. Vertical scaling alternatives have been exhausted. The decision needs to balance write scalability, operational burden, and cost.

Decision

Adopt **Aurora Limitless Database** as the primary sharding strategy, with `tenant_id` as the shard key for high-write tables. Reference tables (lookup tables, configurations) will be defined as reference tables in Limitless to avoid cross-shard joins. Migration will be done via dual-write with consistency validation before cutover. Maintain classic Aurora as fallback for 90 days post-migration.

Consequences

✅ Horizontal write scaling without application logic changes — Limitless Transaction Router absorbs routing
✅ Automatic shard rebalancing eliminates the main source of operational risk in manual sharding
✅ Compatibility with existing PostgreSQL drivers — no DAL (Data Access Layer) refactoring
⚠️ Vendor lock-in increases: exiting Limitless requires migration to classic Aurora with manual sharding or Citus
⚠️ DPU cost needs monitoring — risk of non-linear cost growth with TPS increase
⚠️ Shard key (tenant_id) must be present in all write queries on distributed tables — queries without shard key will broadcast and be expensive

Why Not Application-Level Sharding: The Lesson from Figma and Notion

Figma and Notion are well-documented cases of application-layer sharding. Both worked — but both have data engineering teams of dozens of people, years of runway to stabilize the implementation, and the capacity to absorb the costs of a complex migration. What these cases teach is not that application-level sharding is a good idea for everyone — it's that it's a good idea for those who have the resources to do it correctly.

For a 6-engineer team with a 99.99% SLA, the real cost of application-level sharding is not the initial routing code. It's what comes after: the first hot shard that appears at 2am and needs to be manually rebalanced; the schema migration that needs to run across 8 shards in a coordinated fashion with tested rollback; the reporting query someone wrote without a shard key that's now doing a full scan across all shards; the bug in the shard registry that sends 0.1% of writes to the wrong shard and is only discovered 3 days later.

These are not hypothetical problems. They are the problems that invariably appear in manual sharding implementations, and the question is whether the team has the capacity to solve them without compromising the SLA. The honest answer, for a 6-person team without prior experience in distributed sharding, is no.

This doesn't mean application-level sharding is wrong. It means it has an organizational maturity cost that needs to be acknowledged. If the team grows to 15-20 engineers with specific distributed systems experience, the equation changes — and the portability of manual sharding becomes a real asset. Today, it's not.

Target Architecture: Aurora Limitless with Tenant-Based Routing

Write/read flow in production after migration to Aurora Limitless. The Transaction Router is the single entry point — the application doesn't know about shard groups. Reference tables are replicated across all shard groups to avoid cross-shard joins.

🖥️ Application Layer

Application · (ECS Tasks)
PgBouncer · Connection Pool

🔀 Aurora Limitless — Router Layer

Transaction Router · (Limitless Endpoint)

🗄️ Aurora Limitless — Shard Groups

Shard Group 1 · tenant_id: 0-33%
Shard Group 2 · tenant_id: 33-66%
Shard Group 3 · tenant_id: 66-100%
Reference Tables · (replicated all shards)

📊 Observability

CloudWatch · DPU / Latency / Errors
Datadog · APM + DB Metrics

🔒 Security

IAM Auth · DB Credentials via Secrets Manager

Detailed Technical Comparison

	Criterion	App-Level Sharding	Aurora Limitless	Managed Citus
Operational burden	High	Low	Medium	—
Cross-shard queries	Manual (scatter-gather in app)	Supported via Transaction Router	Natively supported (Citus planner)	—
Rebalancing	Manual — team's responsibility	Automatic by AWS	Supported by Citus extension	—
PostgreSQL compatibility	Full (standard PostgreSQL)	High, with documented DDL limitations	High, with limitations on some complex queries	—
Portability	Maximum	Minimal (AWS-only)	High (open-source)	—
Relative cost (estimate)	Lower (independent Aurora writers)	Higher (additional DPUs)	Medium (depends on managed vs self-hosted)	—
Feature maturity	High (industry standard)	Medium (GA 2024)	High (Citus exists since 2012)	—
Time to production (estimate)	12-18 months to stabilize	3-6 months (migration + validation)	6-9 months (if cloud change)	—

Migration Plan to Aurora Limitless

1
Phase 1 — Audit (Weeks 1-3)
Map all queries that don't include tenant_id. Identify stored procedures with DDL incompatible with Limitless. Catalog candidate reference tables. Measure data distribution by tenant to identify hot tenants.
2
Phase 2 — Limitless Environment (Weeks 4-6)
Provision Aurora Limitless cluster in staging environment. Create distributed tables with tenant_id as shard key. Create reference tables for lookup data. Run load tests with shadow traffic.
3
Phase 3 — Dual-Write (Weeks 7-10)
Implement dual-write in DAL: all writes go to both classic Aurora AND Aurora Limitless. Reads continue on classic Aurora. Run continuous consistency validation between both systems. Monitor divergences.
4
Phase 4 — Gradual Cutover (Weeks 11-14)
Migrate reads to Aurora Limitless by tenant (starting with smaller tenants). Monitor P99 latency and error rate. Gradually increase percentage. Full cutover when 100% of tenants validated.
5
Phase 5 — Stabilization (Weeks 15-24)
Maintain classic Aurora as fallback for 90 days. Monitor DPU cost vs projection. Disable dual-write. After 90 days without incidents, deprecate classic Aurora and release resources.

My Perspective: The Invisible Cost of Complexity

Senior Solutions Architect

When I analyze this type of decision, the first question I ask is not technical — it's organizational: who will wake up at 3am when this breaks, and what will they need to know to fix it? Application-level sharding is technically elegant when done well. But 'done well' requires a level of operational maturity that most teams underestimate. The routing code is the easy part. The hard part is rebalancing hot shards in production with an active SLA, coordinating schema migrations across multiple shards, and debugging inconsistencies that only appear in cross-shard queries during peak hours. Aurora Limitless solves exactly these problems — at the cost of vendor lock-in and additional DPUs. For a 6-engineer team with a 99.99% SLA, that's a trade-off I would accept without hesitation. DPU cost is predictable and monitorable; the cost of a poorly implemented sharding incident at 3am is not. What I would do differently from what's documented here: I would add an explicit circuit breaker in the Transaction Router — if Limitless returns errors above a threshold, the application automatically falls back to classic Aurora (which remains in dual-write for 90 days). This isn't paranoia; it's acknowledging that Limitless has been GA since 2024 and that any new database feature will have edge cases that only appear in production. On Citus: it's a genuinely good option that I would choose if the system were not heavily coupled to AWS or if there were a strategic portability requirement. The fact that the managed version is on Azure is a real problem for an AWS-first system — not impossible to solve, but it adds a layer of operational complexity that isn't justified here. The general lesson: sharding is a team decision, not just an architecture one. The best technical solution that the team can't operate is worse than a good-enough solution that the team masters.

AWS Well-Architected: Decision Impact

Security

IAM authentication for the Limitless endpoint via Secrets Manager. Encryption at rest and in transit inherited from Aurora. The Transaction Router is the single entry point — reduces attack surface vs multiple independent writers in the app-level sharding model.

Reliability

Aurora Limitless inherits Aurora's distributed storage durability (6 copies across 3 AZs). Automatic shard rebalancing reduces the risk of hot shards causing service degradation. The dual-write plan with fallback to classic Aurora ensures RTO < 5 minutes in case of regression.

Performance efficiency

The Transaction Router distributes writes across shard groups based on tenant_id hash, eliminating the single writer bottleneck. Reference tables avoid cross-shard joins for lookup data. Queries without shard key do broadcasts — they need to be monitored and eliminated.

Sustainability

Consolidation of compute in Limitless vs multiple independent writers may reduce total resource consumption if Limitless is more efficient in DPU allocation per TPS. Requires validation with real benchmarks.

Verdict

Aurora Limitless is the correct decision for this scenario — not because it's the most technically elegant solution, but because it best aligns team capacity, SLA constraints, and delivery speed. Application-level sharding is a legitimate solution for large teams with distributed systems expertise and time to stabilize the implementation. For a 6-engineer team with a 99.99% SLA, the operational cost of manual sharding is prohibitive — not in money, but in risk and cognitive capacity. Citus is technically excellent, but the managed version is on Azure, and moving an AWS-first system to another cloud provider to solve a database problem is a trade-off that doesn't make sense here. The real risks of the decision are two: DPU cost may be non-linear with TPS growth (monitor from day one), and Limitless has been GA since 2024 (maintain the fallback to classic Aurora for at least 90 days post-cutover). Neither of these risks invalidates the decision — they need to be managed. The sharding decision is, ultimately, a decision about where you want complexity to live: in application code, in the managed service, or in the database extension. For this scenario, the managed service is the right answer.

References

Amazon Aurora Limitless Database — Official Documentation Amazon Aurora — Product Page

#aurora#sharding#postgresql#oltp#aurora-limitless#citus#data-platform#scalability

Case sources

Amazon Aurora — Limitless Database AWS — Amazon Aurora

Written with AI assistance from the public case and my architect's reading.

Listen to study

Scenario Context

The Real Problem: Why Sharding Is the Last Option, Not the First

The Forces at Play

Evaluated Options

Option A: Application-Layer Sharding

Option B: Aurora Limitless Database

Option C: Citus / Managed PostgreSQL (Azure or self-managed)

Architecture Decision

Why Not Application-Level Sharding: The Lesson from Figma and Notion

Target Architecture: Aurora Limitless with Tenant-Based Routing

Detailed Technical Comparison

Migration Plan to Aurora Limitless

Phase 1 — Audit (Weeks 1-3)

Phase 2 — Limitless Environment (Weeks 4-6)

Phase 3 — Dual-Write (Weeks 7-10)

Phase 4 — Gradual Cutover (Weeks 11-14)

Phase 5 — Stabilization (Weeks 15-24)

AWS Well-Architected: Decision Impact

Security

Reliability

Performance efficiency

Sustainability

Verdict

References