Architecture Studies

Architecture documents from real cases — ADRs, design docs, post-mortem analyses and teardowns — with my reading as a solutions architect.

Featured study
Design Doc / RFC · Zero Trust (scenario) · 1 source · 2026

Design Doc: Zero Trust on AWS for Internal Service Access

This document proposes a Zero Trust architecture on AWS where identity, context, and device posture replace the network perimeter as the primary access control mechanism. The design covers workload segmentation, adaptive access via IAM Identity Center and Verified Access, and continuous audit instrumentation. The goal is to eliminate implicit trust based on network location without introducing excessive operational friction.

Feb 10, 2026 · 11 min · Security
Design Doc / RFC · Enterprise RAG (scenario) · 1 source · 2026

Design Doc: Enterprise RAG Platform with Continuous Evaluation and Guardrails on Bedrock

This document describes the architecture of an enterprise RAG platform built on Amazon Bedrock, covering semantic retrieval, continuous quality evaluation, safety guardrails, and cost control. The design prioritizes traceability, operability, and risk containment in regulated environments, without sacrificing acceptable end-user latency.

RAG, Bedrock, guardrails, evaluation, enterprise-ai
Feb 5, 2026 · 10 min · AI
Design Doc / RFC · Payments API (scenario) · 1 source · 2026

Design Doc: Multi-Region Active-Active Payments API

This document proposes a multi-region active-active architecture for a critical payments API, targeting near-zero RTO/RPO, deterministic conflict resolution in data replication, and a phased rollout that minimizes operational risk. The design is grounded in real financial engineering principles and AWS patterns, with explicit trade-offs between consistency, latency, and cost.

multi-region, active-active, payments, resilience, rto-rpo
Feb 1, 2026 · 10 min · Resilience
Decision (ADR) · Order processing (scenario) · 2 sources · 2026

ADR: EventBridge vs Kafka/MSK for Order Processing

This ADR evaluates EventBridge and Amazon MSK as the event backbone for an order processing system, weighing throughput, ordering, replay, and operational burden. The decision is grounded in real trade-offs between managed simplicity and platform control, with direct consequences for cost, operability, and delivery guarantees.

event-driven, eventbridge, kafka, msk, order-processing
Jan 25, 2026 · 7 min · Event-driven
Decision (ADR) · Core banking (scenario) · 2 sources · 2026

ADR: Aurora vs DynamoDB for a Double-Entry Ledger in Core Banking

This ADR evaluates Aurora PostgreSQL and DynamoDB as the persistence engine for a double-entry ledger in a core banking system, weighing strong consistency, access patterns, auditability, and cost. The decision favors Aurora with date-range partitioning and an immutable event layer, acknowledging the horizontal scaling constraints that choice imposes.

aurora, dynamodb, ledger, core-banking, double-entry
Jan 20, 2026 · 8 min · Data
Decision (ADR) · Fintech (scenario) · 1 source · 2026

ADR: Modular Monolith vs Microservices in a Greenfield Fintech

An early-stage fintech faces the classic architecture decision: go straight to microservices or build a modular monolith first. This ADR examines the real forces at play — team size, validation speed, blast radius, and operational cost — and records the decision with its concrete consequences.

architecture, fintech, monolith, microservices, adr
Jan 15, 2026 · 9 min · Architecture
Teardown · Figma · 1 source · 2024

Figma: Horizontal Postgres Sharding Without Stopping Growth

In 2022, Figma hit the physical limits of its monolithic Postgres database and executed a horizontal sharding migration using key-based partitioning, dynamic routing, and incremental data movement — all without downtime and without halting product growth. This teardown reconstructs the architecture, analyzes the technical decisions, and points out what I would do differently.

postgres, sharding, data-platform, database, scalability
Mar 14, 2024 · 7 min · Data
Teardown · Discord · 1 source · 2023

Discord: How to Store Trillions of Messages — Cassandra → ScyllaDB Migration Teardown

Discord migrated its message storage from Apache Cassandra to ScyllaDB, eliminating unpredictable tail latencies and GC pauses that affected millions of users. This teardown reconstructs the architecture, examines the engineering decisions and trade-offs involved, and presents my critical read of what was done well — and what I would do differently.

discord, scylladb, cassandra, rust, data-platform
Mar 6, 2023 · 7 min · Data
Post-mortem · Roblox · 1 source · 2021

Roblox 2021: 73 Hours of Downtime, Consul and the Load Effect

In October 2021, Roblox suffered 73 consecutive hours of unavailability, the largest outage in the platform's history. The root cause was a combination of contention in BoltDB (the embedded store behind Consul) and a newly enabled Consul streaming feature, which amplified that contention during a period of elevated load. This post-mortem reconstructs the failure chain, analyzes the infrastructure decisions involved, and extracts lessons applicable to any platform relying on service mesh and distributed coordination.

postmortem, consul, service-mesh, resilience, boltdb
Oct 28, 2021 · 11 min · Resilience