Usage-Based Billing Engineering: Metering, Invoicing, and the Edge Cases That Bite
Usage-based billing engineering sounds straightforward until you are three months into it and your finance team is asking why 4% of invoices do not match the numbers in your dashboard. The model itself is simple: customers pay for what they use. The engineering is not. Event ingestion pipelines, deduplication across retried webhooks, proration when a customer upgrades mid-cycle, refund logic when a metered event is later invalidated — each of these is a distinct problem, and most SaaS teams underestimate all of them.
This post covers how to build a realistic usage-based billing architecture in 2026, using Stripe Metering as the primary example, with reference to Lago for teams that need more flexibility. The goal is not a complete implementation guide but a map of the problems you will encounter and the patterns that actually work.
What Makes Usage-Based Billing Engineering Hard
The gap between a usage-based pricing model and a flat-rate subscription is larger than it appears. A flat-rate subscription is a table lookup: is the customer on a $99/month plan? Bill $99. Usage-based billing is a pipeline: collect events, aggregate them, apply rate tiers, account for plan changes, apply credits, generate an invoice, reconcile with your internal records, and handle disputes when a customer disagrees with the number.
Every step in that pipeline has failure modes. Events arrive late. Events arrive twice. A customer changes plans on the 14th of the month and you need to prorate both the old and new plan correctly. A batch job that generates usage records runs twice due to a deploy retry and double-counts 12 hours of data. A customer claims they used 50,000 API calls but your meter says 61,000, and you need to reconstruct what happened from logs.
The teams that handle this well share one architectural commitment: they treat usage events as immutable facts and separate the concerns of event ingestion, aggregation, billing calculation, and invoice generation into distinct stages that can each be inspected, replayed, and corrected independently.
Event Ingestion Architecture
The foundation of any usage-based billing system is the event ingestion layer. Every billable action — an API call, a message sent, a gigabyte stored, a model inference completed — generates an event that needs to be captured reliably and exactly once.
Exactly-once capture at the ingestion layer is a meaningful engineering constraint. In practice you get there by accepting at-least-once delivery and making event processing idempotent. Your application code typically fires events in response to HTTP requests or background jobs. Both contexts can experience retries: a request handler might retry a failed downstream call; a job worker might re-execute a job that timed out. If you naively emit a billing event for each execution, retries produce duplicate events.
The standard solution is idempotency keys. Every billing event should carry a key that uniquely identifies the real-world action it represents, independent of how many times the event was emitted. Stripe's Metering API accepts an identifier field on each event precisely for this purpose:
await stripe.billing.meterEvents.create({
  event_name: 'api_calls',
  payload: {
    stripe_customer_id: customer.stripeId,
    value: '1',
  },
  identifier: `api-call-${requestId}`, // idempotency key
  timestamp: Math.floor(requestTimestamp / 1000), // requestTimestamp in ms, Stripe expects seconds
});
With an identifier in place, Stripe enforces uniqueness of that identifier over a rolling window of at least 24 hours, so a retried event that carries the same identifier is not counted twice. For your own meter or Lago, you implement the same pattern: store the event identifier in a database table with a unique constraint, and discard events whose identifier already exists.
The choice of what to use as an idempotency key matters. A good key encodes the specific real-world action: the ID of the API request, the ID of the job execution, the hash of the message content plus its timestamp. A bad key is any key generated at emit time — a UUID created when the event fires — because retries generate new UUIDs and the deduplication logic never sees them as duplicates.
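A minimal sketch of that pattern for a self-hosted meter follows. The names `BillableAction`, `idempotencyKey`, and `SeenEvents` are hypothetical, and the in-memory `Set` only stands in for a database table with a unique constraint on the identifier column.

```typescript
// Hypothetical shapes for illustration; not part of any billing SDK.
interface BillableAction {
  kind: string;     // e.g. "api-call"
  sourceId: string; // ID of the real-world action (request ID, job execution ID)
}

// Derive the key only from the action itself, so every retry yields the same key.
function idempotencyKey(action: BillableAction): string {
  return `${action.kind}-${action.sourceId}`;
}

// Stand-in for a table with a unique constraint on the identifier column.
class SeenEvents {
  private readonly seen = new Set<string>();

  // Returns true the first time a key is recorded, false for any duplicate.
  recordIfNew(key: string): boolean {
    if (this.seen.has(key)) {
      return false;
    }
    this.seen.add(key);
    return true;
  }
}

const store = new SeenEvents();
const apiCall: BillableAction = { kind: "api-call", sourceId: "req_123" };

// The original emit is accepted; the retry of the same request is discarded.
const firstAttempt = store.recordIfNew(idempotencyKey(apiCall));
const retryAttempt = store.recordIfNew(idempotencyKey(apiCall));
```

Because the key is a pure function of the request ID, nothing in the retry path can produce a fresh key, which is exactly the property a UUID generated at emit time lacks.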
Stripe Metering in Practice
Stripe's Billing Meter is the right choice for most SaaS teams in 2026 if you are already using Stripe for subscriptions. It handles aggregation (sum, count, or last) per customer per billing period, stores the raw event data for up to 13 months, and feeds directly into Stripe's invoice generation.
Setting up a meter requires deciding on the aggregation formula and the event payload schema upfront. A meter that counts API calls looks different from one that bills on peak storage in gigabytes or on the maximum concurrent connections in a billing period. The default_aggregation field on the meter object controls this:
const meter = await stripe.billing.meters.create({
  display_name: 'API Calls',
  event_name: 'api_calls',
  default_aggregation: {
    formula: 'sum', // documented values: 'sum', 'count', 'last'
  },
  customer_mapping: {
    event_payload_key: 'stripe_customer_id',
    type: 'by_id',
  },
});
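A meter configured with count ignores the value in the payload and simply counts events per customer per period. A meter with sum adds up the value field across all events. A meter with last takes the value of the most recent event in the period, which suits gauge-style metrics where each event reports a current level. Stripe's documented formulas do not include a max aggregation, so peak-based billing models like "billed on highest concurrent connections this month" need the peak computed in your own system before it is reported.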
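To make the difference between aggregation styles concrete, here is a small local model. It is only a sketch that runs in your own code, not a description of how Stripe implements its meters; `MeterEvent`, `aggregateSum`, `aggregateCount`, and `aggregatePeak` are hypothetical names, and values are assumed to be non-negative.

```typescript
// Hypothetical local event shape; values are assumed to be non-negative numbers.
interface MeterEvent {
  customerId: string;
  value: number;
}

// Sum: total of all reported values (e.g. bytes transferred).
function aggregateSum(events: MeterEvent[]): number {
  return events.reduce((total, e) => total + e.value, 0);
}

// Count: number of events, regardless of the value each one carries.
function aggregateCount(events: MeterEvent[]): number {
  return events.length;
}

// Peak: highest single value seen, computed locally for peak-based pricing.
function aggregatePeak(events: MeterEvent[]): number {
  return events.reduce((peak, e) => Math.max(peak, e.value), 0);
}

const sampleEvents: MeterEvent[] = [
  { customerId: "cus_a", value: 3 },
  { customerId: "cus_a", value: 5 },
  { customerId: "cus_a", value: 2 },
];

const sumResult = aggregateSum(sampleEvents);     // 10
const countResult = aggregateCount(sampleEvents); // 3
const peakResult = aggregatePeak(sampleEvents);   // 5
```

The same three events produce three different billable quantities, which is why the formula has to be settled before the first event is sent.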
Once events are flowing into the meter, you link it to a price on a subscription item:
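const price = await stripe.prices.create({
  currency: 'usd',
  product: productId,
  billing_scheme: 'per_unit',
  // unit_amount is an integer number of cents, so a sub-cent price
  // needs unit_amount_decimal instead: 0.015 cents = $0.00015 per call
  unit_amount_decimal: '0.015',
  recurring: {
    interval: 'month',
    usage_type: 'metered',
    meter: meter.id,
  },
});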
At invoice time, Stripe queries the meter for the total usage per customer for the period and generates a line item automatically. For most teams this removes the need to write aggregation logic — but it does not remove the need to verify it. Build a reconciliation job that queries Stripe's meter event summary API and compares it against your internal usage database. Discrepancies happen. Finding them before your customer does is significantly better than finding them during a billing dispute.
Lago as an Alternative
Lago is an open-source billing engine worth serious consideration when your pricing model is too complex for Stripe Metering, or when you need billing logic that lives in your own infrastructure. Lago supports billable metrics with custom aggregation formulas, graduated price tiers, pay-in-advance and pay-in-arrears models, and real-time usage previews — capabilities that would require custom code on top of Stripe.
The tradeoff is operational complexity. Lago is a service you deploy and maintain (or use as a hosted product). Your event ingestion pipeline sends events to Lago's API, Lago handles aggregation and invoice generation, and you integrate with Stripe or another payment processor for actual payment collection. More moving parts, but more control.
For teams building complex SaaS products — particularly those with volume discounts, credit systems, or multi-dimensional pricing — Lago's flexibility often justifies the additional infrastructure. For teams with straightforward metered pricing on a single dimension, Stripe Metering with careful reconciliation is sufficient.
Proration: The Edge Case Most Teams Get Wrong
Proration is what happens when a customer changes plans mid-billing-cycle. A customer on a $200/month plan with 10,000 included API calls upgrades to a $500/month plan with 50,000 included calls on the 14th of a 30-day month. What do you owe them, and what do they owe you, for the current cycle?
Stripe handles basic subscription proration automatically when you update a subscription item, crediting the unused portion of the old plan and billing the prorated portion of the new plan. The complexity comes when usage-based components are involved.
The problem is that the meter does not reset on a plan change — the customer has already consumed some usage on the old plan, potentially against a different rate or tier structure. When the cycle closes at month end, the full period's usage is billed at the new plan's rates unless you explicitly close the current meter period and open a new one at the time of the plan change.
The pattern that works reliably is to invoice immediately at plan change time to close the old billing period, then update the subscription with a new billing cycle anchor:
// At plan change time:
// 1. Invoice immediately to close the current period
await stripe.invoices.create({
  customer: customer.stripeId,
  subscription: subscription.id,
  pending_invoice_items_behavior: 'include',
});
// 2. Update subscription with a reset cycle
await stripe.subscriptions.update(subscription.id, {
  items: [{
    id: oldItemId,
    deleted: true,
  }, {
    price: newPriceId,
  }],
  proration_behavior: 'none', // handled manually above
  billing_cycle_anchor: 'now', // reset the cycle from today
});
This approach is more code than letting Stripe handle proration automatically, but it gives you a clean audit trail: one invoice covers the old plan period, the next covers the new plan period from the change date. Finance teams understand this model; they struggle with Stripe's automatic proration credits, which appear as line items in the next invoice with descriptions that require explanation.
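For the fixed-fee part of the example at the start of this section ($200 to $500 on the 14th of a 30-day month), the arithmetic can be sketched as below. This is a simplified day-based model in integer cents that assumes 14 days have elapsed when the change happens; Stripe's own proration is time-based and more precise, and `PlanChange`, `Proration`, and `prorate` are hypothetical names.

```typescript
// Simplified day-based proration in integer cents (an assumption for illustration).
interface PlanChange {
  oldPriceCents: number;
  newPriceCents: number;
  daysInCycle: number;
  daysElapsed: number; // days already consumed on the old plan
}

interface Proration {
  creditCents: number; // unused portion of the old plan
  chargeCents: number; // remaining portion of the new plan
  netCents: number;    // what the customer owes for the rest of the cycle
}

function prorate(change: PlanChange): Proration {
  const remainingFraction =
    (change.daysInCycle - change.daysElapsed) / change.daysInCycle;
  const creditCents = Math.round(change.oldPriceCents * remainingFraction);
  const chargeCents = Math.round(change.newPriceCents * remainingFraction);
  return { creditCents, chargeCents, netCents: chargeCents - creditCents };
}

const upgrade = prorate({
  oldPriceCents: 20000, // $200/month plan
  newPriceCents: 50000, // $500/month plan
  daysInCycle: 30,
  daysElapsed: 14,
});
```

Under these assumptions the customer is credited about $106.67 for the unused old plan, charged about $266.67 for the rest of the cycle on the new plan, and owes a net $160.00. The included-call allowances are deliberately left out, since that is the part the invoice-at-change pattern above is meant to keep clean.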
Deduplication Beyond the Ingestion Layer
Ingestion-layer deduplication handles duplicate events from retried HTTP requests. A separate class of duplicates comes from the aggregation and reporting layer, and it is harder to detect.
The scenario: a nightly job aggregates the previous day's usage from your database and reports it to your billing system. The job runs at 2:00 AM. At 2:30 AM, a deployment causes a job retry. The job runs again, finds the same day's usage records, and reports them again. If your billing system does not deduplicate at the reporting layer — or if you did not design idempotency into the job — you have now double-counted a full day of usage for every customer.
Three things prevent this. First, design aggregation jobs to be idempotent: write aggregated results to a table with a unique constraint on (customer_id, period_date, meter_name), so a second run upserts the same values rather than inserting new rows. Second, use idempotency keys at the reporting layer — when reporting aggregated usage to Stripe or Lago, encode the period and customer in the key so the billing system discards duplicates. Third, run a daily reconciliation job that compares total reported usage in your billing system against total events in your internal database for the same period; any mismatch triggers an alert before the next invoice cycle runs.
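The first two safeguards can be sketched together. In this illustration the `Map` stands in for a table with a unique constraint on (customer_id, period_date, meter_name); `DailyAggregate`, `AggregateTable`, `reportingIdentifier`, and `runNightlyJob` are hypothetical names, and the usage total is hard-coded where a real job would query it.

```typescript
interface DailyAggregate {
  customerId: string;
  periodDate: string; // e.g. "2026-03-14"
  meterName: string;
  total: number;
}

// Composite key mirroring a unique constraint on the three columns.
function rowKey(row: DailyAggregate): string {
  return `${row.customerId}|${row.periodDate}|${row.meterName}`;
}

class AggregateTable {
  private readonly rows = new Map<string, DailyAggregate>();

  // Upsert: a second run overwrites the same row instead of adding a new one.
  upsert(row: DailyAggregate): void {
    this.rows.set(rowKey(row), row);
  }

  size(): number {
    return this.rows.size;
  }
}

// The reporting key encodes customer, meter, and period, so a billing system
// that deduplicates on it discards the second report of the same day.
function reportingIdentifier(row: DailyAggregate): string {
  return `usage-${row.meterName}-${row.customerId}-${row.periodDate}`;
}

function runNightlyJob(table: AggregateTable): string {
  const row: DailyAggregate = {
    customerId: "cus_a",
    periodDate: "2026-03-14",
    meterName: "api_calls",
    total: 1200, // a real job would compute this from stored events
  };
  table.upsert(row);
  return reportingIdentifier(row);
}

const aggregates = new AggregateTable();
const firstRunKey = runNightlyJob(aggregates);
const retryRunKey = runNightlyJob(aggregates); // simulated deploy retry
```

After the retry the table still holds one row and both runs produce the same reporting key, so neither layer double-counts the day.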
Handling Free Tiers and Credits
Most SaaS products with usage-based pricing include a free tier: "First 10,000 API calls per month are free." Engineering this correctly requires tracking consumed credits separately from billable usage.
For a standard graduated tier, Stripe handles the threshold cleanly: $0 per unit for the first 10,000, $0.00015 per unit thereafter. Configure this in the price object's tiers array, with billing_scheme set to tiered and tiers_mode set to graduated. Stripe computes the tier splits automatically at invoice time.
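For the independent check on your side, the graduated calculation is short enough to reproduce. The sketch below works in integer micro-dollars to avoid floating-point drift on sub-cent rates; `Tier` and `graduatedChargeMicros` are hypothetical names, not Stripe types.

```typescript
// A tier covers usage up to `upTo` units in total; null means no upper bound.
interface Tier {
  upTo: number | null;
  unitAmountMicros: number; // price per unit in micro-dollars (1e-6 USD)
}

// Graduated pricing: each unit is billed at the rate of the tier it falls into.
function graduatedChargeMicros(quantity: number, tiers: Tier[]): number {
  let remaining = quantity;
  let lowerBound = 0;
  let total = 0;
  for (const tier of tiers) {
    if (remaining <= 0) {
      break;
    }
    const capacity = tier.upTo === null ? remaining : tier.upTo - lowerBound;
    const unitsInTier = Math.min(remaining, capacity);
    total += unitsInTier * tier.unitAmountMicros;
    remaining -= unitsInTier;
    if (tier.upTo !== null) {
      lowerBound = tier.upTo;
    }
  }
  return total;
}

const apiCallTiers: Tier[] = [
  { upTo: 10000, unitAmountMicros: 0 },  // first 10,000 calls free
  { upTo: null, unitAmountMicros: 150 }, // $0.00015 per call thereafter
];

const heavyUseMicros = graduatedChargeMicros(61000, apiCallTiers);
const freeTierMicros = graduatedChargeMicros(8000, apiCallTiers);
```

With 61,000 calls, 51,000 fall into the paid tier, giving 7,650,000 micro-dollars, or $7.65. A customer at 8,000 calls stays entirely inside the free tier.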
For pre-paid credit packs — "buy 1M API calls for $100" — you are managing a ledger that depletes as events are reported. This ledger lives in your database, not in Stripe. Stripe does not natively model "customer has X units remaining from a purchased pack." You track the pack balance, reduce it as usage events arrive, and only begin reporting billable usage to Stripe after the pack is exhausted. This requires careful sequencing: your usage event handler must check the ledger before deciding whether to send an event to the meter.
The ledger approach also means your reconciliation job needs to account for two populations of usage: pack-consumed events (which should not appear as Stripe meter events) and billable events (which should). A mismatch between the sum of both and your raw event count is a signal that something is wrong in the routing logic.
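The routing decision itself is a small split, sketched here under the assumption that a single event can straddle the point where the pack runs out. `RoutingResult` and `routeUsage` are hypothetical names, and a real implementation would perform the balance update inside a database transaction.

```typescript
interface RoutingResult {
  fromPack: number;         // units absorbed by the pre-paid pack
  billable: number;         // units that should be reported to the meter
  remainingBalance: number; // pack balance after this event
}

// Split one usage event between the pre-paid pack and billable usage.
function routeUsage(packBalance: number, quantity: number): RoutingResult {
  const fromPack = Math.min(packBalance, quantity);
  return {
    fromPack,
    billable: quantity - fromPack,
    remainingBalance: packBalance - fromPack,
  };
}

// 300 units left in the pack, event reports 500 units of usage.
const straddling = routeUsage(300, 500);

// Pack already exhausted: everything is billable.
const exhausted = routeUsage(0, 500);
```

The invariant the reconciliation job relies on is visible here: `fromPack + billable` always equals the event's quantity, so the two populations must add up to the raw usage.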
Reconciliation as a First-Class Concern
Every usage-based billing system should treat reconciliation as a first-class engineering concern, not a quarterly finance task. The goal is to close the gap between what your system records as usage and what Stripe invoices — before your customers see a discrepancy.
A minimal reconciliation setup covers three layers. At the event level, compare the count of events in your internal database against the count returned by Stripe's meter event summary API for each billing period, and flag any customer where the numbers diverge by more than 0.1%. At the invoice level, after each invoice is generated, compare the line item amounts against an independent calculation from your own usage database. At the audit log level, every event that touches a customer's usage record — event received, event deduplicated, aggregate reported, credit applied, invoice line item generated — should be queryable from a single place so you can reconstruct exactly what produced a given number.
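The event-level comparison might look like the following sketch. It assumes both sides have already been reduced to a per-customer total for the period; how the billed totals are fetched is left out, and `findDiscrepancies` is a hypothetical helper rather than an API of any billing provider.

```typescript
type UsageTotals = Record<string, number>; // customer ID -> units in the period

// Returns the customers whose internal and billed totals diverge by more than
// the tolerance (0.001 = 0.1%), measured relative to the larger of the two.
function findDiscrepancies(
  internal: UsageTotals,
  billed: UsageTotals,
  tolerance: number = 0.001,
): string[] {
  const onlyInBilled = Object.keys(billed).filter((id) => !(id in internal));
  const customerIds = Object.keys(internal).concat(onlyInBilled);

  const flagged: string[] = [];
  for (const id of customerIds) {
    const internalTotal = internal[id] ?? 0;
    const billedTotal = billed[id] ?? 0;
    const baseline = Math.max(internalTotal, billedTotal);
    if (baseline === 0) {
      continue; // no usage on either side
    }
    if (Math.abs(internalTotal - billedTotal) / baseline > tolerance) {
      flagged.push(id);
    }
  }
  return flagged.sort();
}

const internalTotals: UsageTotals = { cus_a: 100000, cus_b: 61000, cus_c: 500 };
const billedTotals: UsageTotals = { cus_a: 100050, cus_b: 61100, cus_d: 40 };

const flaggedCustomers = findDiscrepancies(internalTotals, billedTotals);
```

Here `cus_a` is within tolerance (0.05%), `cus_b` is not (about 0.16%), and `cus_c` and `cus_d` are flagged because each exists on only one side, which is often the more interesting failure.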
Wolf-Tech has helped several SaaS teams build this reconciliation infrastructure as part of broader custom software development engagements. In every case, the value was not in finding bugs that had already been noticed — it was in finding a class of quiet discrepancies accumulating below the threshold that triggered customer complaints but well above zero.
What a Realistic Architecture Looks Like
Pulling these pieces together, a production-ready usage-based billing architecture for a SaaS product in 2026 looks something like this.
Your application emits events to an internal event queue (Kafka, RabbitMQ, or SQS) rather than calling the billing API synchronously on the request path. A billing consumer reads from the queue, deduplicates events using an idempotency key store (Redis with TTL, or a Postgres table with a unique constraint), applies any in-flight credits from pre-paid packs, and forwards billable events to Stripe Metering. A separate reconciliation service runs nightly to compare internal records against Stripe's meter summaries and alert on discrepancies. A thin billing admin interface lets support staff view a customer's event history, apply credits, and generate explanatory breakdowns for invoice disputes.
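A compressed sketch of the billing consumer shows how the stages fit together, and in particular the order of side effects. Everything here is an assumption for illustration: `UsageEvent`, `Outcome`, and `BillingConsumer` are hypothetical names, the in-memory `Set` and `Map` stand in for the key store and credit ledger, and `forward` is synchronous for brevity where a real sink (for example a call to the billing API) would be asynchronous.

```typescript
interface UsageEvent {
  identifier: string; // idempotency key
  customerId: string;
  quantity: number;
}

type Outcome = "duplicate" | "covered-by-pack" | "forwarded";

class BillingConsumer {
  private readonly seen = new Set<string>();
  private readonly packBalances: Map<string, number>;
  private readonly forward: (billable: UsageEvent) => void;

  constructor(
    packBalances: Map<string, number>,
    forward: (billable: UsageEvent) => void,
  ) {
    this.packBalances = packBalances;
    this.forward = forward;
  }

  handle(incoming: UsageEvent): Outcome {
    // Stage 1: deduplicate on the idempotency key.
    if (this.seen.has(incoming.identifier)) {
      return "duplicate";
    }
    // Stage 2: apply pre-paid credits.
    const balance = this.packBalances.get(incoming.customerId) ?? 0;
    const fromPack = Math.min(balance, incoming.quantity);
    const billable = incoming.quantity - fromPack;
    // Stage 3: forward first. If this throws, no state below is committed
    // and the queue can redeliver the event safely.
    if (billable > 0) {
      this.forward({ ...incoming, quantity: billable });
    }
    this.packBalances.set(incoming.customerId, balance - fromPack);
    this.seen.add(incoming.identifier);
    return billable > 0 ? "forwarded" : "covered-by-pack";
  }
}

const forwarded: UsageEvent[] = [];
const consumer = new BillingConsumer(
  new Map<string, number>([["cus_a", 300]]),
  (billable) => { forwarded.push(billable); },
);

const usageEvent: UsageEvent = { identifier: "job-42", customerId: "cus_a", quantity: 500 };
const firstOutcome = consumer.handle(usageEvent);
const replayOutcome = consumer.handle(usageEvent); // queue redelivery
```

Committing the ledger and the seen-key only after a successful forward is a design choice: it favors redelivery over silent loss, and relies on the downstream identifier check to absorb any event that was forwarded just before a crash.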
This architecture decouples your application from the billing system's availability, makes every event traceable, and gives you replay capability when bugs are found. It is more infrastructure than a direct Stripe API call in the request handler, but it is also the difference between a billing system you can trust and one you are constantly firefighting.
A tech stack strategy review is often where teams first identify that their billing pipeline is architecturally fragile — the symptoms show up as unexplained invoice discrepancies and support load, not as obvious code bugs.
Getting It Right
Usage-based billing engineering is not glamorous work. It is a combination of careful event schema design, disciplined deduplication, edge-case handling for plan changes and credits, and ongoing reconciliation. Teams that do it well build trust with their finance functions and their customers. Teams that do it poorly spend engineering time on billing disputes instead of product development.
If your team is moving to usage-based pricing or fixing a metering system that is producing inconsistent numbers, the problem is almost always in the deduplication layer or the proration logic — and the fix is almost always architectural rather than tactical.
Wolf-Tech works with SaaS engineering teams in Europe and the US on billing infrastructure and platform architecture. If you are evaluating Stripe Metering versus Lago, designing your event schema, or debugging reconciliation discrepancies, reach out at hello@wolf-tech.io or visit wolf-tech.io for a conversation about your specific setup.

