# Stripe: API Design and Idempotency Keys

An in-depth technical analysis of how Stripe solved the problem of safe retries in payment APIs using idempotency keys, request state storage, and rate limiting — design decisions that became an industry reference.

- URL: https://fernando.moretes.com/studies/stripe-api-idempotency

- Markdown: https://fernando.moretes.com/studies/stripe-api-idempotency/study.md?lang=en

- Type: Teardown

- Company: Stripe

- Domain: API/Payments

- Date: 2017-02-22

- Tags: api-design, idempotency, payments, stripe, distributed-systems, rate-limiting, reliability, fintech

- Reading time: 8 min

---

Every payment API faces a fundamental problem: networks fail, timeouts happen, and the client never knows whether the charge was processed or not. Stripe built an elegant and rigorous solution to this problem — and documented it publicly. This teardown reconstructs the architecture, examines the design decisions, and evaluates where the choices make sense and where I'd do things differently.

## Fact Sheet

- **Company:** Stripe
- **Domain:** Payments API / Financial Infrastructure
- **Original post published:** 2017 (Stripe Engineering Blog)
- **Core stack:** Ruby, REST API, Redis (idempotency state), PostgreSQL, HTTPS/TLS
- **Scale:** Hundreds of billions of dollars processed annually; millions of requests/day from integration partners
- **Core problem:** Ensuring a payment request executed more than once produces exactly the same effect as executing it once
- **Key mechanism:** Idempotency key sent by client in HTTP header, persisted on the server with associated result

## The Problem: Money in Transit and Unreliable Networks

Payments are transactions with real and asymmetric consequences. If you charge a customer twice for a single order, you've created a serious business problem — refunds, chargeback disputes, erosion of trust. If you don't charge when you should, you've lost revenue. The problem isn't theoretical: in distributed systems, message delivery has three possible states — delivered, not delivered, or **unknown**. It's that third state that kills naive payment systems.

The classic scenario: a client makes a `POST /charges` request to create a charge. The request reaches Stripe's server, processing begins, the charge is created on the card network — and then the connection drops before the response reaches the client. The client received a timeout. What should it do? If it simply retries the request, it may charge the user twice. If it doesn't retry, it may lose the sale and leave the user without the service they paid for.

This problem is amplified by Stripe's context: it is an infrastructure platform. Its clients are developers and companies building products on top of the API. Any solution needs to work **reliably and predictably** for thousands of different integrations, many written by engineers who are not distributed systems experts. The solution cannot depend on the client doing something sophisticated — it needs to be simple to use correctly and hard to use wrong.

## Reconstructed Architecture: Idempotency Flow in the Stripe API

Flow of a POST /charges request with idempotency key, including success, duplicate, and failure paths.

### 👤 Client

- API Client (SDK / HTTP) (user)

### 🌐 Edge / Auth

- TLS Termination + Auth (edge)
- Rate Limiter (per key / IP) (security)

### ⚙️ API Layer

- API Server (charge handler) (compute)
- Idempotency Middleware (compute)

### 🗄️ State Store

- Redis (idempotency keys + locks) (data)
- PostgreSQL (charges, events, result payload) (storage)

### 💳 External

- Card Network (Visa/Mastercard) (external)

### 📬 Async

- Job Queue (webhooks / events) (messaging)
- Webhook Delivery (retry w/ backoff) (compute)

### Flows

- client -> tls: POST /charges, Idempotency-Key: uuid
- tls -> ratelimit: authenticated
- ratelimit -> idem_check: within limit
- idem_check -> redis: lookup key (GET idem_key)
- redis -> idem_check: HIT → return cached result
- idem_check -> api: MISS → process new request
- api -> redis: SET lock (atomic)
- api -> postgres: persist charge + idem record
- api -> cardnet: card authorization
- cardnet -> api: approved/declined
- api -> postgres: update result + final status
- api -> redis: SET result (TTL 24h)
- api -> queue: enqueue event charge.succeeded
- queue -> webhook: async delivery
- webhook -> client: POST webhook (retry backoff)
- ratelimit -> client: 429 Too Many Requests

## How It Works: The Mechanics of Idempotency Keys

Stripe's solution is elegant in its apparent simplicity, but there is considerable depth in the implementation details.

**The basic contract**: the client generates a unique UUID per *operation intent* and sends it in the `Idempotency-Key` header. If the same key is sent again within a 24-hour window, the server returns exactly the same result as the first execution — without reprocessing, without charging again. The key is bound to the pair `(api_key, idempotency_key)`, not just the idempotency key in isolation, which prevents collisions between different accounts.
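The contract above can be sketched from both sides. This is a minimal illustration, not Stripe's implementation: the function names, the `idem:` prefix, and the choice to hash the API key before using it in a store key are all assumptions made for the example.

```python
import hashlib
import uuid


def new_idempotency_key():
    """Client side: one key per operation intent, reused for every retry."""
    return str(uuid.uuid4())


def build_headers(api_key, idempotency_key):
    """Client side: headers attached to the POST /charges request."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Idempotency-Key": idempotency_key,
    }


def storage_key(api_key, idempotency_key):
    """Server side: lookup key scoped to the account (hypothetical layout).

    Hashing the API key is an assumption of this sketch; it keeps the
    secret itself out of the key space of the state store.
    """
    account = hashlib.sha256(api_key.encode()).hexdigest()[:16]
    return f"idem:{account}:{idempotency_key}"
```

Because the account is part of the stored key, two accounts that happen to send the same UUID never see each other's results.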

**The server flow**: upon receiving a request, the idempotency middleware performs a `GET` in Redis for the composite key. If it finds a complete result, it returns immediately with the original HTTP status and original payload, so the client cannot distinguish a first execution from a retry. If it finds nothing, it acquires a distributed lock with an atomic `SET NX` in Redis, which ensures that concurrent requests with the same key do not process in parallel. This is critical to avoid race conditions where two simultaneous processes try to create the same charge.

**Intermediate states**: the system needs to handle the case where a request is still *in processing* when a second one arrives with the same key. Stripe returns a `409 Conflict` in this case, signaling to the client that the original operation is still in progress. A distinct status code keeps the client from reading the response as a business failure such as a declined card.
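The lookup, lock, and conflict paths described above fit in a few lines. The sketch below uses an in-memory dictionary as a stand-in for Redis, so `IdempotencyStore`, `set_nx`, and `handle` are hypothetical names, and the store is not thread-safe; a real deployment would rely on the atomicity of Redis `SET` with the `NX` option.

```python
IN_PROGRESS = object()  # sentinel standing in for the lock value


class IdempotencyStore:
    """In-memory stand-in for Redis, for illustration only."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set_nx(self, key, value):
        """Mimics SET NX: write only if the key is absent."""
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def set(self, key, value):
        self._data[key] = value


def handle(store, key, process):
    """Run `process` at most once per key; replay or reject otherwise."""
    conflict = (409, {"error": "a request with this key is in progress"})
    cached = store.get(key)
    if cached is IN_PROGRESS:
        return conflict
    if cached is not None:
        return cached  # replay the original status and payload
    if not store.set_nx(key, IN_PROGRESS):
        return conflict  # another request won the race for the lock
    result = process()  # e.g. (200, {...}) from the charge handler
    store.set(key, result)
    return result
```

Crash handling is deliberately left out: if `process` raises, the lock stays in place, which is the orphaned-lock problem a production design has to address with a lock TTL or a recovery job.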

**Result persistence**: when processing completes, whether successfully or with a business error (card declined, for example), the result is persisted both in PostgreSQL (as part of the charge record) and in Redis with a 24-hour TTL. Business errors are also idempotent: if the first attempt resulted in `card_declined`, the second attempt with the same key returns the same `card_declined` without retrying on the card network. This is an important and non-obvious design decision, discussed in the section on business errors below.
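A small sketch of the cached-result side of this, assuming an in-memory cache with an injectable clock in place of Redis key expiry. `ResultCache` and `TTL_SECONDS` are illustrative names; the point is that a declined result is stored and replayed exactly like a success until the TTL elapses.

```python
import time

TTL_SECONDS = 24 * 60 * 60  # the 24-hour window from the text


class ResultCache:
    """Stores the final (status, body) per key and expires it after a TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (expires_at, status, body)

    def put(self, key, status, body, ttl=TTL_SECONDS):
        self._entries[key] = (self._clock() + ttl, status, body)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, status, body = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: the key may be used again
            return None
        return status, body
```

Storing a `card_declined` response under the key means a retry never reaches the card network; once the entry expires, the same key would be treated as a new request.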

**Retries with exponential backoff**: Stripe's official SDKs implement automatic retries with exponential backoff and jitter for requests that fail with a 5xx response or a network timeout. The idempotency key is generated once and reused across all retries of the same operation. This transforms the problem of 'how to guarantee exactly one execution' into 'how to guarantee at least one execution with an idempotent result', a fundamental shift in perspective.
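The retry loop can be sketched as follows. This is not the code of any Stripe SDK: `TransientError`, `backoff_delay`, and `call_with_retries` are hypothetical names, and the base delay, cap, and "full jitter" strategy are assumptions chosen for the example.

```python
import random
import time
import uuid


class TransientError(Exception):
    """Stands in for a network timeout or a 5xx response."""


def backoff_delay(attempt, base=0.5, cap=8.0):
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt))."""
    return random.random() * min(cap, base * (2 ** attempt))


def call_with_retries(send, max_attempts=4, sleep=time.sleep):
    """Call send(idempotency_key), retrying transient failures.

    The key is generated once, outside the loop, so every attempt
    carries the same key and the server can deduplicate.
    """
    key = str(uuid.uuid4())
    for attempt in range(max_attempts):
        try:
            return send(key)
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure
            sleep(backoff_delay(attempt))
```

Generating the key inside the loop would defeat the mechanism: each retry would look like a new charge intent.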

## Rate Limiting: The Protection Layer That Completes the Picture

Idempotency solves the safety problem of retries, but it doesn't solve the volume problem. A client with a bug can send thousands of requests per second — with or without correct idempotency keys. Rate limiting is the layer that protects the infrastructure and ensures fairness between clients.

Stripe implements rate limiting across multiple dimensions: per API key (account), per endpoint, and globally. Per-endpoint granularity is important: a read endpoint like `GET /charges` has more generous limits than a write endpoint like `POST /charges`, because the processing cost and risk of side effects are different.

The most common mechanisms for rate limiting at high scale are the **token bucket** and the **sliding window counter**, typically backed by Redis. Stripe's engineering blog describes its request rate limiter as a token bucket, and the documented limit of 100 requests per second per API key for most endpoints in production is consistent with that. The `Retry-After` response header in 429 responses is an important design signal: it tells the client *when* it can try again, transforming an error into actionable information.
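As an illustration of one of the algorithms named above, here is a single-process token bucket that also computes a value suitable for a `Retry-After` header. The class and its parameters are assumptions for the sketch; a shared limiter would keep this state in Redis and update it atomically.

```python
import time


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity` tokens."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self):
        """Return (allowed, retry_after_seconds)."""
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True, 0.0
        # Time until one full token is available again.
        return False, (1 - self._tokens) / self.rate
```

With `rate=100` and `capacity=100`, a client can burst 100 requests and then sustain 100 per second; a rejected request learns how long to wait instead of guessing.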

A critical detail in the interaction between idempotency and rate limiting: a request blocked by rate limiting **does not consume the idempotency key**. This is semantically correct — the client was prevented from trying, so the key remains available for when it can try again. If the rate limiter consumed the key, the client would be stuck: blocked from trying and with the key 'burned' for that operation.
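The ordering is what makes this work: the limiter runs before the idempotency layer, so a rejected request never writes anything under its key. A minimal sketch, with a plain dictionary as the state store and `handle_request` as a hypothetical name:

```python
def handle_request(key, allow, store, process):
    """Rate-limit first, then apply idempotency.

    `allow` is a callable returning True when the request is within
    limits; `store` maps idempotency keys to (status, body) results.
    """
    if not allow():
        # Rejected before the idempotency layer: nothing is recorded
        # for `key`, so the client can reuse it on the next attempt.
        return 429, {"error": "rate_limited"}
    if key in store:
        return store[key]  # replay the stored result
    store[key] = process()
    return store[key]
```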

Stripe also documents the practice of **idempotency in webhooks**: the webhook delivery system may deliver the same event more than once (at-least-once delivery), and clients are instructed to use `event.id` as an idempotency key in processing. This closes the loop: the API is idempotent on input, and the notification system is idempotent on output.
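On the receiving side, deduplicating by `event.id` can look like the sketch below. The in-memory set and the `make_webhook_handler` name are assumptions; a real consumer would record processed IDs in durable storage, ideally in the same transaction as the side effect.

```python
def make_webhook_handler(fulfil):
    """Wrap `fulfil` so each event ID triggers it at most once."""
    seen = set()  # stand-in for a durable table of processed event IDs

    def handle(event):
        event_id = event["id"]
        if event_id in seen:
            return "duplicate"  # acknowledge, but do no work
        fulfil(event)
        seen.add(event_id)  # mark as done only after the work succeeded
        return "processed"

    return handle
```

Marking the event after `fulfil` succeeds means a failed delivery can be retried by the sender without being mistaken for a duplicate.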

## Decision Matrix: Core Design Trade-offs

### Server-managed idempotency (Stripe approach)

**Pros**
- Works with any HTTP client — no special logic needed beyond generating and reusing a UUID
- Deterministic result regardless of the number of retries
- Protects against client bugs (retry loops, aggressive reconnections)

**Cons**
- Requires state storage on the server (Redis + PostgreSQL) — operational and latency cost
- 24h TTL bounds the safe retry window: a retry sent after the key expires, or after the client loses the original UUID, is processed as a new request
- Idempotent business errors can surprise developers (card_declined is not retryable with the same key)

**Verdict:** Correct choice for an infrastructure platform with thousands of integrators of varying sophistication levels

### Client-managed idempotency (alternative design)

**Pros**
- No additional state on the server
- More flexible for sophisticated clients who want full control

**Cons**
- Requires each integrator to implement deduplication logic correctly — source of production bugs
- Impossible to audit or debug on the server side
- Does not scale as a platform pattern

**Verdict:** Inadequate for a public payments API — transfers complexity to where it causes the most damage

### Fully idempotent operations by design (e.g., PUT semantics)

**Pros**
- No idempotency key needed — the operation itself is safe to repeat
- Simpler mental model for configuration operations (e.g., updating customer data)

**Cons**
- Impossible for charge creation — each charge is a new financial intent, not a state update
- Requires different domain modeling (resources as state vs. commands as events)

**Verdict:** Complementary, not a substitute — suitable for resource endpoints, not transaction endpoints

## AWS Well-Architected Framework Read

- **security**: **Strong.** The idempotency key is bound to the API key, preventing a malicious actor from reusing another client's keys. TLS mandatory on all calls. Rate limiting per API key functions as abuse control beyond infrastructure protection. One concern: idempotency keys are sent in HTTP headers — if logged carelessly, they can leak into observability systems. Stripe does not explicitly document how it handles this.
- **reliability**: **Excellent — it is the central pillar of the design.** The system transforms network failures into recoverable events without risk of duplication. The combination of distributed lock (Redis NX) + result persistence (PostgreSQL) guarantees exactly-once semantics from a business effect perspective, even when the network is unreliable. The 409 Conflict for concurrent requests is a conscious reliability choice — it prefers to reject rather than process in parallel.
- **performance**: **Good with conscious trade-offs.** Redis as an idempotency result cache adds an extra network round-trip on the happy path (lookup before processing), but the cost is justified by the benefit. For repeated requests (retries), the Redis short-circuit avoids the entire processing chain — including the card network call, which is the most expensive operation. The 24h TTL is a balance between retry window coverage and memory usage.
- **cost**: **Efficient for the problem solved.** The incremental cost of Redis for storing idempotency keys is marginal compared to the cost of processing duplicate charges — both in infrastructure and business consequences (chargebacks, support). The design avoids unnecessary reprocessing on the card network, which has a per-transaction cost.
- **sustainability**: **Positive.** By avoiding unnecessary reprocessing — especially card network calls — the design reduces wasted computation. The Redis short-circuit for retries is more energy-efficient than reprocessing the full stack.

## The Non-Obvious Decision: Business Errors Are Idempotent

The most interesting — and most debatable — decision in Stripe's design is that **business errors are also idempotent**. If you send a request with key `idem_abc123` and the card is declined (`card_declined`), a second request with the same key returns the same `card_declined` without retrying on the card network.

The logic behind this is coherent: the idempotency key represents a *specific charge intent*. If that intent resulted in a decline, the decline is the correct result for that intent. If the client wants to try again — perhaps the user updated their card, or wants to try a different card — that is a *new intent*, which should have a new idempotency key.

This has an important consequence for payment flow design: the client needs to understand the difference between **retryable errors** (5xx, network timeout) and **non-retryable errors** (business errors like `card_declined`, `insufficient_funds`). For retryable errors, reuse the key. For business errors, generate a new key if you want to try again under different conditions.
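The rule can be reduced to a small decision function. This is a sketch under stated assumptions: `is_retryable` and `key_for_next_attempt` are hypothetical helpers, `None` represents "no response arrived", and treating 409 and 429 as same-key retries follows the earlier discussion of in-progress requests and rate limiting.

```python
import uuid

# Responses that say "try again later" without recording a final result.
RETRY_LATER = {409, 429}


def is_retryable(status):
    """True when the same operation should be retried with the same key."""
    if status is None:  # timeout or dropped connection: outcome unknown
        return True
    return status in RETRY_LATER or 500 <= status <= 599


def key_for_next_attempt(previous_key, status):
    """Reuse the key for retryable failures, mint a new one otherwise."""
    if is_retryable(status):
        return previous_key
    # A business error such as card_declined is a final result for this
    # intent; trying again (e.g. with another card) is a new intent.
    return str(uuid.uuid4())
```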

This distinction is powerful but requires educating integrators. Stripe invests heavily in documentation and SDKs that encapsulate this logic: the Python SDK, for example, automatically retries on network errors and server errors, never on business errors such as `card_declined`. This is platform design: the correct architectural decision implemented so that the default behavior is the safe behavior.

> **What I'd do differently — and what I'd carry into any financial system:** Stripe's design is solid and I'd endorse most choices without hesitation. But there are three points where I'd think differently:

**1. Configurable idempotency TTL by operation type.** 24 hours is a reasonable choice for most cases, but in financial systems with specific processing windows (e.g., intraday settlement, batch closing), a fixed TTL can be too short or too long depending on context. I'd expose a mechanism for the client to declare the expected retry window — or internally, I'd vary the TTL by endpoint type and operation criticality.

**2. Explicit separation between 'processing lock' and 'result cache'.** Redis serves two purposes in the design: distributed lock during processing and result cache after. Mixing these two roles in the same store creates operational coupling — a Redis failure during processing can leave orphaned locks. In high-criticality systems, I'd use separate stores with different TTLs and eviction policies, or use PostgreSQL as the source of truth for idempotency state and Redis only as a fast-read cache.

**3. Idempotency observability as a first-class citizen.** The design doesn't explicitly document idempotency metrics — hit/miss rate, retry distribution per key, time between first attempt and successful retry. In any financial system I operate, these metrics are front-line dashboards. An increase in the retry rate is an early signal of network degradation or bugs in integrators — you want to know this before it becomes an incident.
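For the third point, the metrics in question are cheap to collect at the middleware. The sketch below keeps them in process memory; `IdempotencyMetrics` and the outcome labels are illustrative, and a real system would export the same counters to its metrics backend.

```python
from collections import Counter


class IdempotencyMetrics:
    """Counts lookup outcomes and how often each key is seen."""

    def __init__(self):
        self.outcomes = Counter()  # "hit", "miss", "conflict"
        self.attempts = Counter()  # requests seen per idempotency key

    def record(self, key, outcome):
        self.outcomes[outcome] += 1
        self.attempts[key] += 1

    def hit_rate(self):
        """Share of completed lookups that were replays of a stored result."""
        total = self.outcomes["hit"] + self.outcomes["miss"]
        return self.outcomes["hit"] / total if total else 0.0

    def retried_keys(self):
        """Number of keys that were sent more than once."""
        return sum(1 for n in self.attempts.values() if n > 1)
```

A rising hit rate or a growing count of retried keys is the early signal described above: clients are retrying more, which usually points to network degradation or an integrator bug.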

## Verdict

Stripe's idempotency design is a case study in solving a genuinely difficult problem in a way that makes the result seem obvious in retrospect — the most reliable signal of good engineering design. The combination of server-managed idempotency keys, distributed locking for concurrency, result persistence including business errors, and SDKs that encapsulate the correct behavior by default creates a system that is simultaneously robust for Stripe's infrastructure and safe for integrators of all sophistication levels.

What makes this design especially valuable as a study is that it is not specific to payments. The same principles apply to any operation with side effects in distributed systems: email sending, resource provisioning, financial transfers, order creation. Idempotency is not an API feature — it is a system property that needs to be designed from the start.

The points where I'd diverge — configurable TTL, store separation, first-class observability — are refinements for contexts of higher criticality or greater operational complexity. For a public API with thousands of integrators, Stripe's choices of simplicity and safe defaults are the correct ones.

## References

- [Stripe — Designing robust and predictable APIs with idempotency](https://stripe.com/blog/idempotency)
