Idempotency Keys for SaaS APIs: Making Retries Safe Across Payments and Provisioning
Every HTTP client retries. Networks time out, load balancers reset connections, mobile clients reconnect after a brief loss of signal. In a read-heavy API this is mostly fine - fetching the same resource twice returns the same data. But in a write-heavy SaaS, retries without idempotency keys create a specific category of production incident that keeps engineering teams awake at night: the customer gets charged twice, two workspaces get provisioned for the same signup, or a hundred welcome emails go out to a single address.
Idempotency keys are the standard solution. They are also one of the most consistently under-implemented features in early-stage SaaS products - until the first incident report arrives. This post covers how to implement them correctly across the operations where they matter most.
What an Idempotency Key Actually Is
An idempotency key is a client-generated unique identifier attached to a mutating request. When your API receives a request with a key it has already processed, it returns the original response instead of executing the operation again. The client gets the outcome it expected, and your system does not duplicate work.
The concept is straightforward. The implementation has a number of details that are easy to get wrong.
Most APIs accept idempotency keys as an HTTP header - Idempotency-Key is the de facto standard name. The client generates a UUID (or any globally unique string) before the first attempt and reuses the same key on every retry. Your server stores the key alongside the result of the first successful execution and serves that cached result on subsequent requests with the same key.
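A minimal sketch of the client side of this contract. The `FlakyTransport` class is a hypothetical stand-in for a real HTTP client and server pair, and `post_with_retries` is an illustrative helper name, not part of any library. The point it shows is that the key is generated once, outside the retry loop.

```python
import uuid


class FlakyTransport:
    """Hypothetical stand-in for an HTTP client talking to an idempotent API.

    The first call does the work on the "server" but loses the response,
    which is the timeout case that makes retries dangerous.
    """

    def __init__(self):
        self.processed = {}  # idempotency key -> stored response
        self.calls = 0

    def post(self, path, body, headers):
        self.calls += 1
        key = headers["Idempotency-Key"]
        if key not in self.processed:
            charge_id = f"ch_{len(self.processed) + 1}"
            self.processed[key] = {"status": 201, "charge_id": charge_id}
        if self.calls == 1:
            raise TimeoutError("response lost in transit")
        return self.processed[key]


def post_with_retries(transport, path, body, max_attempts=3):
    # Generate the key once, before the first attempt, and reuse it on every retry.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    for attempt in range(1, max_attempts + 1):
        try:
            return transport.post(path, body, headers)
        except TimeoutError:
            if attempt == max_attempts:
                raise


transport = FlakyTransport()
response = post_with_retries(transport, "/charges", {"amount": 1000})
```

Generating the UUID inside the loop would give every retry a fresh key, and the server would have no way to recognize the duplicate.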
Keys typically expire after 24 to 48 hours - long enough to cover any reasonable retry window, short enough that your storage does not grow unbounded.
Where Exactly-Once Semantics Actually Matter
Not every endpoint needs idempotency keys. GET requests are already safe to retry. Even some POST requests are naturally idempotent if they are structured as upserts. The operations where a missing idempotency key creates real business risk fall into three categories.
Payment charges. Charging a customer is the most obvious case. A payment processor call that times out at the HTTP layer may have already succeeded at the processor side. If your client retries without an idempotency key, you will charge twice. Stripe, Adyen, and most modern payment processors accept idempotency keys natively - but you still need to pass them, and you need to store which key maps to which charge so you can handle disputes and refunds correctly.
Account and workspace provisioning. SaaS signup flows frequently involve cascading writes: create a user record, provision a workspace, set up default permissions, trigger a welcome email, create a Stripe customer. If any step in this chain fails and the client retries the whole flow, you can end up with duplicate users or duplicate workspaces. The idempotency key should cover the entire compound operation, not just individual steps.
Webhook delivery. Webhook consumers are API endpoints that receive events from external systems. Providers such as Stripe, GitHub, and Twilio can deliver the same event more than once, whether through automatic retries after a timeout or through manual redelivery, so duplicate delivery is normal, not exceptional. Your webhook handler must be idempotent. The event ID sent with each delivery (in the payload or in a header, depending on the provider) serves as your idempotency key: store it on first receipt and discard duplicates.
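For the webhook case, a minimal sketch of storing the event ID on first receipt and discarding duplicates. It assumes an SQLite table named `processed_events` and an event shaped like `{"id": ..., "data": {...}}`. Both are illustrative choices, not any provider's schema.

```python
import sqlite3

events_db = sqlite3.connect(":memory:")
events_db.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")

credited = []  # placeholder for the handler's real side effect


def handle_webhook(event):
    """Return True if the event was processed, False if it was a duplicate."""
    try:
        with events_db:  # commits on success, rolls back on error
            # The primary key makes the insert fail for an event ID seen before.
            events_db.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)",
                (event["id"],),
            )
            # In a real handler this would be a write in the same transaction.
            credited.append(event["data"]["amount"])
    except sqlite3.IntegrityError:
        return False
    return True


event = {"id": "evt_123", "data": {"amount": 500}}
first = handle_webhook(event)
second = handle_webhook(event)  # duplicate delivery of the same event
```

Relying on the unique constraint, rather than a separate lookup followed by an insert, means two simultaneous deliveries cannot both pass the check.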
The Server-Side Implementation Pattern
At the core, the pattern is a check-then-execute with atomic storage. Before processing any mutating request:
1. Hash or normalize the idempotency key (trim whitespace, lowercase if your key format allows it).
2. Look up the key in your idempotency store.
3. If found, return the stored response directly - do not re-execute.
4. If not found, acquire a lock on the key, execute the operation, store the result, release the lock.
Step 4 deserves attention. Between the lookup (step 2) and the lock (step 4), a concurrent request with the same key can arrive. Without a lock, both requests will find the key absent and both will execute the operation. The lock must be acquired before execution begins, not after it completes, and the key must be checked again once the lock is held, because another request may have stored a result in the meantime. A database row-level lock or a distributed lock (Redis SET with the NX and PX options) both work, depending on your infrastructure.
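The four steps can be sketched in a single process as follows. This is an in-memory illustration under stated assumptions: `IdempotencyStore` and `run_once` are hypothetical names, and `threading.Lock` stands in for the row-level or distributed lock a multi-server deployment would need.

```python
import threading


class IdempotencyStore:
    """In-memory check-then-execute with a per-key lock (single process only)."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the two dicts below
        self._results = {}              # key -> stored response
        self._locks = {}                # key -> per-key lock

    def run_once(self, raw_key, operation):
        key = raw_key.strip().lower()              # step 1: normalize
        with self._guard:
            if key in self._results:               # steps 2 and 3: lookup, replay
                return self._results[key]
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:                                 # step 4: lock before executing
            with self._guard:
                if key in self._results:           # re-check now that we hold the lock
                    return self._results[key]
            result = operation()
            with self._guard:
                self._results[key] = result
            return result


calls = []


def charge():
    calls.append(1)
    return {"status": 201, "body": {"charge_id": "ch_1"}}


store = IdempotencyStore()
threads = [
    threading.Thread(target=store.run_once, args=("Key-123 ", charge))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Eight threads present the same key, but `charge` runs once. The second lookup inside the lock is what closes the gap between steps 2 and 4. The per-key lock table grows without bound in this sketch, so a real store would evict entries along with expired keys.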
The stored result should include the HTTP status code, response headers, and response body. When you replay a result, replay all three - not just the body. Clients that inspect status codes for branching logic will behave incorrectly if they receive a replayed body with a different status.
One detail that is often omitted: if the first execution fails with a 4xx error (invalid input, for example), the key should not be stored - or should be stored with a short TTL. 4xx responses indicate a client error, and the client may legitimately want to fix the request and retry with the same key. Storing a 4xx result permanently would cause the corrected request to fail with a stale error. 5xx failures, on the other hand, may indicate partial execution, and the safest behavior is to treat the key as "in progress" until the operation is confirmed complete or rolled back.
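One way to encode that policy is a small function that maps the status of the first execution to a record state and TTL. The names `StoredResponse` and `storage_decision`, the 60-second short TTL, and the 24-hour default are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StoredResponse:
    """Everything needed to replay a result faithfully."""
    status: int
    headers: dict
    body: str


def storage_decision(status, full_ttl=24 * 3600, short_ttl=60):
    """Return (state, ttl_seconds) for the idempotency record.

    short_ttl=0 would mean "do not store 4xx results at all".
    """
    if status < 400:
        return ("completed", full_ttl)    # success: replay for the full window
    if status < 500:
        return ("completed", short_ttl)   # client error: let a fixed request retry soon
    return ("in_progress", full_ttl)      # server error: outcome unknown, do not replay yet


record = StoredResponse(201, {"Content-Type": "application/json"}, '{"id": "ch_1"}')
state, ttl = storage_decision(record.status)
```

Keeping the status, headers, and body together in one record makes it hard to replay the body with the wrong status code by accident.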
Handling Concurrent Requests with the Same Key
A subtlety that bites teams implementing this for the first time: what should you return if a request arrives with a key that is still being processed by another request, possibly on a different thread or a different server?
The two main options are to return HTTP 409 (Conflict) or to wait and then return the result once the first execution completes. The 409 approach is simpler to implement but requires the client to retry after a short delay, which adds complexity on the client side. The wait-and-return approach is friendlier to clients but requires a timeout, since the in-flight execution might fail and never complete.
Stripe's implementation returns 409 for in-flight keys. For most SaaS use cases, that is a reasonable default - it keeps the server-side implementation simple and the client retry logic is straightforward.
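A minimal sketch of the 409 approach, assuming a single process and a hypothetical `InFlightTracker` class. The "concurrent" request is simulated by calling the handler again from inside the first operation, which keeps the example deterministic.

```python
import threading


class InFlightTracker:
    """Reject a key that is still executing; replay it once it has completed."""

    def __init__(self):
        self._guard = threading.Lock()
        self._in_flight = set()
        self._completed = {}

    def handle(self, key, operation):
        with self._guard:
            if key in self._completed:
                return self._completed[key]
            if key in self._in_flight:
                return (409, {"error": "request with this key is still in progress"})
            self._in_flight.add(key)
        try:
            response = operation()
            with self._guard:
                self._completed[key] = response
            return response
        finally:
            # Always clear the marker, even if the operation raised.
            with self._guard:
                self._in_flight.discard(key)


tracker = InFlightTracker()
seen_by_second_request = []


def slow_operation():
    # Simulate a second request arriving while the first is still executing.
    seen_by_second_request.append(
        tracker.handle("key-1", lambda: (201, {"id": "ws_2"}))
    )
    return (201, {"id": "ws_1"})


first = tracker.handle("key-1", slow_operation)
replay = tracker.handle("key-1", slow_operation)  # served from the stored result
```

The second request sees 409 while the first is running, and a later retry with the same key gets the stored 201 without executing anything.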
Idempotency Keys in Background Jobs
The same problem exists in background job systems, and it is less commonly addressed. When a job queue delivers a task, it typically guarantees at-least-once delivery. A worker that crashes mid-execution will have the job re-queued and re-delivered. If the job performs external writes - charging a card, sending an email, provisioning infrastructure - the re-execution can cause duplicates.
The pattern here is the same: use a stable, job-specific key to guard external side effects. The job's own ID is a natural choice. Before each external write, check whether this job ID has already produced that write. If yes, skip it. If no, execute and record.
This is especially important in payment flows triggered by background jobs. A charge job that gets re-queued after a crash should detect that the charge already completed and skip re-charging - not re-execute the payment call. Storing the external transaction ID alongside the job ID gives you the data you need to answer that question reliably.
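A sketch of that guard for a charge job. The `job_effects` table, the `charge_card` function, and the `txn_` ID format are hypothetical. `charge_card` stands in for a real payment processor call.

```python
import sqlite3

jobs_db = sqlite3.connect(":memory:")
jobs_db.execute(
    """CREATE TABLE job_effects (
           job_id      TEXT NOT NULL,
           effect      TEXT NOT NULL,
           external_id TEXT NOT NULL,
           PRIMARY KEY (job_id, effect)
       )"""
)

processor_calls = []


def charge_card(job_id, amount):
    """Hypothetical processor call; returns an external transaction ID."""
    processor_calls.append((job_id, amount))
    return f"txn_{len(processor_calls)}"


def run_charge_job(job_id, amount):
    row = jobs_db.execute(
        "SELECT external_id FROM job_effects WHERE job_id = ? AND effect = 'charge'",
        (job_id,),
    ).fetchone()
    if row:
        return row[0]  # this job already charged; skip the external call
    txn_id = charge_card(job_id, amount)
    with jobs_db:  # record the transaction ID alongside the job ID
        jobs_db.execute(
            "INSERT INTO job_effects VALUES (?, 'charge', ?)", (job_id, txn_id)
        )
    return txn_id


first_run = run_charge_job("job-42", 1000)
redelivered = run_charge_job("job-42", 1000)  # same job delivered again after a crash
```

A crash between the processor call and the insert would still leave a gap. Passing the job ID to the processor as its idempotency key, where the processor supports one, is one way to narrow it.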
Scoping Keys to Users and Operations
A key sent by one user should not be reusable by a different user, or for a different operation. Without scoping, a malicious client could potentially replay another user's operations by guessing or reusing their key.
Scope idempotency keys to (user_id, operation_type, idempotency_key) at minimum. Some teams also scope by API version, since a key that was valid against v1 of an endpoint might mean something different against v2.
In practice, the simplest scoping approach is to store keys in a table with a compound unique index on (tenant_id, idempotency_key) and to always look keys up by both columns. A key presented by a different tenant than the one who first used it then never matches the stored record, so it cannot be used to replay that tenant's response.
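A minimal sketch of the compound index using SQLite. The table and column names are illustrative. Under this schema, a key presented by a different tenant simply does not match the stored record.

```python
import sqlite3

scoped_db = sqlite3.connect(":memory:")
scoped_db.execute(
    """CREATE TABLE idempotency_keys (
           tenant_id       TEXT NOT NULL,
           idempotency_key TEXT NOT NULL,
           response_body   TEXT NOT NULL,
           UNIQUE (tenant_id, idempotency_key)
       )"""
)


def save(tenant_id, key, response_body):
    with scoped_db:
        scoped_db.execute(
            "INSERT INTO idempotency_keys VALUES (?, ?, ?)",
            (tenant_id, key, response_body),
        )


def lookup(tenant_id, key):
    # Always filter by tenant as well as key, so one tenant's key
    # can never return another tenant's stored response.
    row = scoped_db.execute(
        "SELECT response_body FROM idempotency_keys "
        "WHERE tenant_id = ? AND idempotency_key = ?",
        (tenant_id, key),
    ).fetchone()
    return row[0] if row else None


save("tenant_a", "k1", "a-result")
own_hit = lookup("tenant_a", "k1")
cross_tenant = lookup("tenant_b", "k1")
```

Adding an `operation_type` column to the unique index extends the same idea to the per-operation scoping described above.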
Practical Implementation Advice
If you are adding idempotency keys to an existing SaaS codebase rather than building from scratch, start with the two or three endpoints that cause the most pain when retried - usually charge endpoints and provisioning endpoints. Add a generic idempotency middleware layer once you have validated the pattern in one place, then extend it broadly.
Use a dedicated table (or Redis hash) for idempotency records, not your main application tables. Idempotency records have different retention and indexing requirements, and mixing them with business data makes cleanup harder.
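A dedicated table makes cleanup a single indexed delete. The sketch below assumes an `expires_at` column holding a Unix timestamp and a hypothetical `purge_expired` function that a scheduled job would call.

```python
import sqlite3
import time

records_db = sqlite3.connect(":memory:")
records_db.execute(
    """CREATE TABLE idempotency_records (
           idempotency_key TEXT PRIMARY KEY,
           response_body   TEXT NOT NULL,
           expires_at      REAL NOT NULL
       )"""
)
# Index the expiry column so the periodic purge does not scan the whole table.
records_db.execute(
    "CREATE INDEX idx_records_expires ON idempotency_records (expires_at)"
)


def purge_expired(now=None):
    """Delete expired records and return how many were removed."""
    now = time.time() if now is None else now
    with records_db:
        cursor = records_db.execute(
            "DELETE FROM idempotency_records WHERE expires_at <= ?", (now,)
        )
    return cursor.rowcount


with records_db:
    records_db.execute("INSERT INTO idempotency_records VALUES ('old', '{}', 100.0)")
    records_db.execute("INSERT INTO idempotency_records VALUES ('new', '{}', 300.0)")

removed = purge_expired(now=200.0)
```

With Redis the equivalent is usually a per-key expiry set at write time, which removes the need for a purge job.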
Set up monitoring for idempotency key hits. A high replay rate on a particular endpoint often signals a client that is retrying excessively due to a timeout misconfiguration on their end - worth investigating proactively rather than discovering during a billing dispute.
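A sketch of the counter behind such a metric. `ReplayMetrics` is a hypothetical in-process class. In production the same two counters would more likely be emitted to whatever metrics system you already run.

```python
from collections import Counter


class ReplayMetrics:
    """Track how often each endpoint serves a stored response instead of executing."""

    def __init__(self):
        self.requests = Counter()
        self.replays = Counter()

    def record(self, endpoint, replayed):
        self.requests[endpoint] += 1
        if replayed:
            self.replays[endpoint] += 1

    def replay_rate(self, endpoint):
        total = self.requests[endpoint]
        return self.replays[endpoint] / total if total else 0.0


metrics = ReplayMetrics()
for replayed in (False, False, True, False):
    metrics.record("POST /charges", replayed)

rate = metrics.replay_rate("POST /charges")
```

Breaking the rate down by client or API key as well as by endpoint makes it easier to find the one integration with a misconfigured timeout.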
If you need a review of how your current API handles retries, or if you are designing a SaaS API from scratch and want to get the reliability patterns right from the start, get in touch at hello@wolf-tech.io. We help SaaS teams at wolf-tech.io build APIs that stay correct under the network conditions that real clients actually experience.
The Broader Principle
Idempotency keys are part of a broader discipline: designing distributed systems that behave correctly not just when everything works, but when things partially fail. Networks are unreliable. Clients always retry. The question is whether your system handles that gracefully or whether it leaves customers with duplicate charges and support teams with manual cleanup work.
For SaaS APIs that handle money or provisioning, implementing idempotency keys is not an advanced optimization - it is a basic reliability requirement. The implementation work is modest. The failure mode it prevents is not.
If your current API handles payment processing or account provisioning, it is worth auditing whether your idempotency story is solid. The custom software development work we do at Wolf-Tech regularly surfaces this gap in SaaS codebases that have scaled past their initial architecture. It is one of those things that is easy to add early and painful to retrofit after the first production incident.

