Building a Notification System for SaaS: Channels, Preferences, and Delivery Guarantees
Every SaaS product eventually needs notifications. A new message arrives, a subscription is about to expire, a teammate mentions you in a comment -- the user needs to know. The first implementation is usually a direct call: send an email here, push a Slack message there. It works for a weekend project. It stops working the moment you add a second channel, let users configure their preferences, or operate a system where retrying a failed delivery can send the same notification twice.
Building a notification system for SaaS that actually scales requires thinking through four problems at once: which channels you support, how users express their preferences, how you guarantee delivery without duplication, and how the whole thing stays maintainable as requirements grow. This post walks through each of those layers in practical terms.
Why Notification Systems Get Complicated Fast
The naive implementation treats notifications as fire-and-forget side effects: a user signs up, so you call $mailer->send(...) inline. This creates a cluster of problems that only surface in production.
Coupling to the delivery mechanism. Your business logic now depends on an external service. If the SMTP relay is slow, your signup endpoint is slow. If it is down, signups fail.
No retry logic. Transient failures -- network blips, rate limits, provider outages -- silently drop notifications with no recovery path.
No preference support. Some users want email. Some want Slack. Some want nothing at all. Hardcoding a single channel means either annoying everyone or building preference logic scattered across every call site.
Double-send on retries. Once you add a job queue to solve the first two problems, you introduce a new one: if the worker crashes after delivering but before acknowledging, the job runs again and the user gets two copies of the same notification.
Each of these is solvable. The key is to solve them in the right order, with a design that keeps each concern in its own layer.
Layer 1: The Notification Event
The foundation of a clean notification system is separating the intent from the delivery. Instead of calling a mailer or a Slack client directly, your application code emits a notification event -- a plain data object that describes what happened, to whom, and what context is relevant.
final class SubscriptionExpiringNotification
{
    public function __construct(
        public readonly string $userId,
        public readonly \DateTimeImmutable $expiresAt,
        public readonly string $planName,
    ) {}
}
This object carries no knowledge of how it will be delivered. It just describes the fact: user $userId has a subscription expiring at $expiresAt. A dispatcher accepts the event and hands it off to the delivery layer. Your application code emits the event and moves on.
This separation gives you several things immediately: the business logic is testable without mocking mailers, you can add new channels without touching the emitting code, and you have a natural point to apply user preferences before any delivery happens.
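To make the testability point concrete, here is a minimal sketch of the emitting side. The NotificationDispatcher interface, the RecordingDispatcher test double, and the SubscriptionReminderService are hypothetical names introduced for illustration; only the event class comes from the example above.

```php
<?php
declare(strict_types=1);

// Event class repeated from the article so the sketch runs on its own.
final class SubscriptionExpiringNotification
{
    public function __construct(
        public readonly string $userId,
        public readonly \DateTimeImmutable $expiresAt,
        public readonly string $planName,
    ) {}
}

// Hypothetical contract: application code depends only on this interface.
interface NotificationDispatcher
{
    public function dispatch(object $event): void;
}

// Test double that records events instead of delivering anything.
final class RecordingDispatcher implements NotificationDispatcher
{
    /** @var list<object> */
    public array $events = [];

    public function dispatch(object $event): void
    {
        $this->events[] = $event;
    }
}

// Hypothetical business service: it emits the event and moves on.
final class SubscriptionReminderService
{
    public function __construct(private readonly NotificationDispatcher $dispatcher) {}

    public function remind(string $userId, \DateTimeImmutable $expiresAt, string $planName): void
    {
        $this->dispatcher->dispatch(
            new SubscriptionExpiringNotification($userId, $expiresAt, $planName)
        );
    }
}

$dispatcher = new RecordingDispatcher();
$service = new SubscriptionReminderService($dispatcher);
$service->remind('user-42', new \DateTimeImmutable('2030-01-31'), 'Pro');
```

A test can now assert on the recorded events without a mailer, an SMTP relay, or a Slack client anywhere in sight.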
Layer 2: User Preferences
User preferences should be stored as explicit rules, not as one boolean column per channel. A column-per-channel approach -- email_notifications: true, slack_notifications: false -- forces a schema migration every time you add a channel. A rule-based approach stores one row per combination of event type and channel, keeps the on/off state as data, and lets users configure it per event category.
A simple schema stores one row per user per event type per channel:
CREATE TABLE notification_preferences (
    user_id UUID NOT NULL,
    event_type VARCHAR(100) NOT NULL,
    channel VARCHAR(50) NOT NULL,
    enabled BOOLEAN NOT NULL DEFAULT TRUE,
    PRIMARY KEY (user_id, event_type, channel)
);
When a notification event arrives, the dispatcher queries this table to determine which channels to use for this user. If no preference exists, fall back to a sensible default -- typically email only. This gives users control without requiring them to configure anything before the system works.
One practical consideration: event types should be grouped into categories in the UI ("Account & Billing", "Team Activity", "Security Alerts") even if stored as individual types in the database. Security-critical notifications -- password changes, new login from an unknown device -- should be non-optional and always delivered via email regardless of preference. Make this explicit in the schema or in the dispatcher logic, not as a special case buried in the delivery code.
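The resolution rules from the last two paragraphs can be sketched as a pure function over the preference rows. The event type names, the DEFAULT_CHANNELS and MANDATORY_CHANNELS constants, and the resolveChannels function are assumptions made for this example. The sketch also assumes that "no preference exists" means no rows at all for that user and event type.

```php
<?php
declare(strict_types=1);

// Assumed defaults and mandatory rules; the event type names are illustrative.
const DEFAULT_CHANNELS = ['email'];
const MANDATORY_CHANNELS = [
    'security.password_changed' => ['email'],
    'security.new_device_login' => ['email'],
];

/**
 * @param list<array{channel: string, enabled: bool}> $rows rows for one user and one event type
 * @return list<string> channels to deliver on
 */
function resolveChannels(string $eventType, array $rows): array
{
    if ($rows === []) {
        // No stored preference: fall back to the default.
        $channels = DEFAULT_CHANNELS;
    } else {
        $channels = [];
        foreach ($rows as $row) {
            if ($row['enabled']) {
                $channels[] = $row['channel'];
            }
        }
    }

    // Mandatory channels are added back even if the user disabled them.
    foreach (MANDATORY_CHANNELS[$eventType] ?? [] as $forced) {
        if (!in_array($forced, $channels, true)) {
            $channels[] = $forced;
        }
    }

    return $channels;
}
```

Keeping the mandatory rules in one visible table, rather than inside an adapter, is what makes the security override auditable.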
Layer 3: Multi-Channel Delivery
Each channel is a separate adapter implementing a shared interface:
interface NotificationChannel
{
    public function supports(string $eventType): bool;
    public function deliver(NotificationEnvelope $envelope): void;
}
The NotificationEnvelope wraps the original event with resolved recipient details -- email address, Slack user ID, push token -- so each adapter has what it needs without querying the database again. Adapters are registered in a collection and selected based on the user's preferences and the event type.
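A rough sketch of the envelope and the adapter collection might look like this. The fields on NotificationEnvelope, the ChannelRegistry class, and the FakeChannel stand-in are assumptions for illustration; only the NotificationChannel interface is taken from above.

```php
<?php
declare(strict_types=1);

// Hypothetical envelope shape: the event plus recipient details resolved once.
final class NotificationEnvelope
{
    /** @param array<string, string> $recipient e.g. ['email' => '...', 'slack' => '...'] */
    public function __construct(
        public readonly string $notificationId,
        public readonly string $eventType,
        public readonly object $event,
        public readonly array $recipient,
    ) {}
}

interface NotificationChannel
{
    public function supports(string $eventType): bool;
    public function deliver(NotificationEnvelope $envelope): void;
}

// Adapters registered by channel name and selected per event.
final class ChannelRegistry
{
    /** @param array<string, NotificationChannel> $channels */
    public function __construct(private readonly array $channels) {}

    /**
     * @param list<string> $preferred channel names from the user's preferences
     * @return array<string, NotificationChannel>
     */
    public function select(string $eventType, array $preferred): array
    {
        $selected = [];
        foreach ($preferred as $name) {
            $channel = $this->channels[$name] ?? null;
            if ($channel !== null && $channel->supports($eventType)) {
                $selected[$name] = $channel;
            }
        }
        return $selected;
    }
}

// In-memory stand-in for a real adapter, for demonstration only.
final class FakeChannel implements NotificationChannel
{
    /** @var list<string> */
    public array $sent = [];

    /** @param list<string> $supportedTypes */
    public function __construct(private readonly array $supportedTypes) {}

    public function supports(string $eventType): bool
    {
        return in_array($eventType, $this->supportedTypes, true);
    }

    public function deliver(NotificationEnvelope $envelope): void
    {
        $this->sent[] = $envelope->notificationId;
    }
}

$registry = new ChannelRegistry([
    'email' => new FakeChannel(['billing.subscription_expiring', 'team.mention']),
    'slack' => new FakeChannel(['team.mention']),
]);
// 'push' is not registered and Slack does not support billing events here.
$selected = $registry->select('billing.subscription_expiring', ['email', 'slack', 'push']);
```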
Email, in-app, Slack, and push each have genuinely different failure modes and rate limits. Email providers like Postmark or SendGrid are generally reliable but have per-account sending limits. Slack API calls can be rejected with rate-limit responses (HTTP 429 with a Retry-After header) that require waiting and backing off before the next attempt. Push notifications via FCM or APNs confirm only that the provider accepted the message, not that the device received or displayed it. Design each adapter to handle its channel's failure modes explicitly rather than relying on a generic retry wrapper.
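As one example of channel-specific handling, a Slack adapter can translate responses into "retry later" versus "give up" before the queue ever sees them. The two exception classes and the interpretSlackResponse function are hypothetical, and the sketch assumes the HTTP client hands over lower-cased header names and a decoded JSON body.

```php
<?php
declare(strict_types=1);

// Hypothetical exception types that tell the queue worker what to do next.
final class RetryableDeliveryException extends \RuntimeException
{
    public function __construct(string $message, public readonly int $retryAfterSeconds)
    {
        parent::__construct($message);
    }
}

final class PermanentDeliveryException extends \RuntimeException {}

/**
 * Interprets an already-received Slack API response.
 *
 * @param array<string, string> $headers lower-cased header names (assumption)
 * @param array<string, mixed>  $body    decoded JSON body
 */
function interpretSlackResponse(int $status, array $headers, array $body): void
{
    if ($status === 429) {
        // Slack signals rate limiting with HTTP 429 and a Retry-After header in seconds.
        $retryAfter = (int) ($headers['retry-after'] ?? 30);
        throw new RetryableDeliveryException('Slack rate limit hit', max(1, $retryAfter));
    }

    if ($status >= 500) {
        // Server-side trouble is usually transient; the 60 second wait is an arbitrary choice.
        throw new RetryableDeliveryException('Slack server error', 60);
    }

    if (($body['ok'] ?? false) !== true) {
        // Errors such as an unknown channel will not fix themselves on retry.
        throw new PermanentDeliveryException(
            'Slack rejected the message: ' . (string) ($body['error'] ?? 'unknown')
        );
    }
}
```

The worker can then requeue on RetryableDeliveryException with the suggested delay and route PermanentDeliveryException straight to inspection instead of burning retries.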
If you offer an in-app notification feed -- the bell icon with a badge count -- treat it as a separate storage write rather than a delivery channel. Write the notification record directly to your database. Because this write depends only on your own database and not on an external provider, it does not need queuing, and it doubles as a record of what the system intended to deliver.
Layer 4: Queuing and Delivery Guarantees
Reliable delivery requires a persistent queue. The dispatcher does not deliver notifications directly; it enqueues a delivery job and returns immediately. The job carries enough information to re-execute the delivery attempt independently: the serialized event, the resolved channel, and a unique notification ID.
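A delivery job along those lines could be as small as the following. The DeliveryJob class, its field names, and the JSON layout are assumptions for this sketch; a real transport such as Symfony Messenger brings its own serializer.

```php
<?php
declare(strict_types=1);

// Hypothetical job payload: everything a worker needs to retry independently.
final class DeliveryJob
{
    /** @param array<string, string> $payload the serialized event */
    public function __construct(
        public readonly string $notificationId,
        public readonly string $channel,
        public readonly string $eventType,
        public readonly array $payload,
    ) {}

    public function toJson(): string
    {
        return json_encode([
            'notification_id' => $this->notificationId,
            'channel' => $this->channel,
            'event_type' => $this->eventType,
            'payload' => $this->payload,
        ], JSON_THROW_ON_ERROR);
    }

    public static function fromJson(string $json): self
    {
        $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

        return new self(
            $data['notification_id'],
            $data['channel'],
            $data['event_type'],
            $data['payload'],
        );
    }
}

$job = new DeliveryJob(
    'b0a1c2d3-0000-5000-8000-000000000001', // generated before enqueueing
    'email',
    'billing.subscription_expiring',
    ['userId' => 'user-42', 'expiresAt' => '2030-01-31T00:00:00+00:00', 'planName' => 'Pro'],
);
$restored = DeliveryJob::fromJson($job->toJson());
```

The point of the round trip is that the restored job carries the same notification ID on every retry.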
Symfony Messenger with a database transport or RabbitMQ works well here. The important configuration decisions are:
Transport persistence. Use a durable transport. An in-memory queue loses all pending notifications on restart. A database-backed transport (Doctrine transport in Symfony Messenger) survives restarts at the cost of slightly higher latency.
Retry policy. Configure a retry delay with exponential backoff and a maximum retry count. Three to five retries with growing delays, for example 1 minute, 5 minutes, and 30 minutes for the first three, cover most transient failures without hammering a provider that is genuinely down.
Dead letter queue. Jobs that exhaust their retries should land in a dead letter queue, not be silently dropped. Alert on dead letter queue growth -- it is a leading indicator of a channel provider outage or a bug in an adapter.
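To show how the retry policy and the dead letter queue fit together, here is a small sketch of the arithmetic. The function names and the default values (60 second base, multiplier of 5, 30 minute cap, 4 retries) are assumptions chosen to land close to the delays mentioned above; in practice the queue library's own retry configuration would do this work.

```php
<?php
declare(strict_types=1);

// Delay before the Nth retry: base * multiplier^(N-1), capped at a maximum.
function retryDelaySeconds(
    int $retryNumber,
    int $baseSeconds = 60,
    float $multiplier = 5.0,
    int $maxSeconds = 1800,
): int {
    if ($retryNumber < 1) {
        throw new \InvalidArgumentException('Retry number starts at 1.');
    }

    $delay = $baseSeconds * ($multiplier ** ($retryNumber - 1));

    return (int) min($delay, $maxSeconds);
}

/**
 * Decides what happens after a job has failed $failedAttempts times in total.
 *
 * @return array{action: string, delay_seconds?: int}
 */
function routeFailedJob(int $failedAttempts, int $maxRetries = 4): array
{
    if ($failedAttempts > $maxRetries) {
        // Retries exhausted: park the job for inspection instead of dropping it.
        return ['action' => 'dead_letter'];
    }

    return ['action' => 'retry', 'delay_seconds' => retryDelaySeconds($failedAttempts)];
}

// Delays for retries 1 to 4: 60, 300, 1500, 1800 seconds.
$schedule = array_map('retryDelaySeconds', [1, 2, 3, 4]);
```

With these numbers the schedule is 1, 5, 25, and 30 minutes, a geometric approximation of the hand-picked delays.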
If your workers can fail after delivery but before acknowledging the job, you need idempotency at the delivery layer, which brings us to the most important problem in the whole system.
Idempotency: The Hard Part
A job queue with retries creates an at-least-once delivery guarantee: the job will run at least once, but may run more than once if the worker crashes at the wrong moment. For most background jobs, this is fine -- running a report generation job twice is wasteful but harmless. For notifications, it is a user-experience problem. Nobody wants two copies of "Your invoice is ready."
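The solution is to make delivery idempotent by tracking which notifications have already been delivered. Before any channel adapter sends a message, it checks whether a delivery record already exists for this notification ID and channel. If it does, the adapter skips the delivery and returns success. If it does not, the adapter delivers and then writes the record immediately. The send and the write cannot be truly atomic, because one is an external API call and the other is a database write, so a crash between the two can still cause a resend. The check narrows that window to a few milliseconds rather than closing it completely.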
CREATE TABLE notification_deliveries (
    notification_id UUID NOT NULL,
    channel VARCHAR(50) NOT NULL,
    delivered_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    PRIMARY KEY (notification_id, channel)
);
The notification ID must be generated before the job is enqueued, not inside the job. Generate it when the event is first dispatched, include it in the job payload, and use it as the deduplication key. If you generate the ID inside the job, each retry gets a new ID and the deduplication check never fires.
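The check-then-deliver-then-record flow can be sketched as follows. The DeliveryLog interface, its in-memory implementation, and the IdempotentSender wrapper are hypothetical; a real implementation would back the log with the notification_deliveries table. Note that the send and the record are two separate steps here, so a crash between them can still produce a duplicate.

```php
<?php
declare(strict_types=1);

// Hypothetical abstraction over the notification_deliveries table.
interface DeliveryLog
{
    public function wasDelivered(string $notificationId, string $channel): bool;
    public function markDelivered(string $notificationId, string $channel): void;
}

// In-memory stand-in so the sketch runs without a database.
final class InMemoryDeliveryLog implements DeliveryLog
{
    /** @var array<string, true> */
    private array $seen = [];

    public function wasDelivered(string $notificationId, string $channel): bool
    {
        return isset($this->seen[$notificationId . '|' . $channel]);
    }

    public function markDelivered(string $notificationId, string $channel): void
    {
        $this->seen[$notificationId . '|' . $channel] = true;
    }
}

final class IdempotentSender
{
    public function __construct(private readonly DeliveryLog $log) {}

    /**
     * @param callable(): void $send the channel-specific delivery call
     * @return bool true if a send happened, false if it was skipped as a duplicate
     */
    public function deliverOnce(string $notificationId, string $channel, callable $send): bool
    {
        if ($this->log->wasDelivered($notificationId, $channel)) {
            return false; // a previous attempt already delivered this
        }

        $send();
        // Recorded right after the send; the two steps are not atomic.
        $this->log->markDelivered($notificationId, $channel);

        return true;
    }
}
```

Because the key is the pair of notification ID and channel, a retried email job is skipped while the Slack job for the same notification still goes out.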
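For email, there may be an additional layer: some providers accept an idempotency key that stops them from sending the same message twice even if you call their API more than once. Check your provider's current documentation before relying on this. The identifiers that Postmark and SendGrid return for an accepted send (the MessageID field in Postmark's response, the X-Message-Id header in SendGrid's) are worth storing alongside your delivery record for tracing, but they are assigned by the provider after the call and do not by themselves deduplicate repeated requests. Where a true idempotency key is available, use it in combination with your own delivery tracking table for belt-and-suspenders deduplication.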
Putting It Together: The Dispatcher
The dispatcher is the central coordinator that ties these layers together:
- Receive the notification event
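- Generate a stable notification ID (UUID v5 derived from event type + user ID + a timestamp bucket, so identical events in a short window deduplicate naturally; where distinct events of the same type can occur within one bucket, such as two different mentions, add an entity ID to the input)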
- Look up the user's channel preferences for this event type
- Drop channels the user has disabled, then add back any channels that are forced for this event type (for example, email for security alerts)
- For each resolved channel, enqueue a delivery job with the notification ID, event payload, and channel
- Write an in-app notification record directly (no queue needed)
Each queued job then runs independently, checks the delivery table, delivers if not already delivered, and writes the delivery record.
This design gives you at-least-once delivery with deduplication at the adapter layer, graceful degradation when a single channel fails, and a complete audit trail of what was attempted and what succeeded.
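The ID generation and fan-out steps of the dispatcher can be sketched like this. The function names, the 5 minute bucket size, and the job array layout are assumptions for illustration. The namespace constant reuses the RFC 4122 DNS namespace only for convenience; any fixed UUID works.

```php
<?php
declare(strict_types=1);

// Any fixed UUID can serve as the namespace; this one is the RFC 4122 DNS namespace.
const NOTIFICATION_NAMESPACE = '6ba7b810-9dad-11d1-80b4-00c04fd430c8';

// Name-based UUID (version 5): SHA-1 over namespace bytes plus name.
function uuidV5(string $namespaceUuid, string $name): string
{
    $namespaceBytes = hex2bin(str_replace('-', '', $namespaceUuid));
    if ($namespaceBytes === false || strlen($namespaceBytes) !== 16) {
        throw new \InvalidArgumentException('Namespace must be a valid UUID.');
    }

    $hash = sha1($namespaceBytes . $name);

    return sprintf(
        '%s-%s-%04x-%04x-%s',
        substr($hash, 0, 8),
        substr($hash, 8, 4),
        (hexdec(substr($hash, 12, 4)) & 0x0fff) | 0x5000, // version 5
        (hexdec(substr($hash, 16, 4)) & 0x3fff) | 0x8000, // RFC 4122 variant
        substr($hash, 20, 12)
    );
}

// Same event type, user, and time bucket always yield the same ID.
function stableNotificationId(
    string $eventType,
    string $userId,
    \DateTimeImmutable $occurredAt,
    int $bucketSeconds = 300,
): string {
    $bucket = intdiv($occurredAt->getTimestamp(), $bucketSeconds);

    return uuidV5(NOTIFICATION_NAMESPACE, "{$eventType}|{$userId}|{$bucket}");
}

/**
 * Builds one job per resolved channel, all sharing the same notification ID.
 *
 * @param list<string> $channels
 * @return list<array{notification_id: string, channel: string, event_type: string}>
 */
function planDeliveryJobs(
    string $eventType,
    string $userId,
    \DateTimeImmutable $occurredAt,
    array $channels,
): array {
    $notificationId = stableNotificationId($eventType, $userId, $occurredAt);

    $jobs = [];
    foreach ($channels as $channel) {
        $jobs[] = [
            'notification_id' => $notificationId,
            'channel' => $channel,
            'event_type' => $eventType,
        ];
    }

    return $jobs;
}
```

One caveat of the bucket approach: two identical events that fall on either side of a bucket boundary get different IDs, so the time bucket reduces duplicates rather than eliminating them.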
Scaling Considerations
A single PostgreSQL table for delivery tracking handles millions of rows without trouble, but add an index on delivered_at if you query for recently delivered state. A partial index with a moving cutoff such as NOW() - INTERVAL '30 days' is not an option: PostgreSQL requires functions used in an index predicate to be immutable, and NOW() is not. A plain index does the job:
CREATE INDEX idx_notification_deliveries_delivered_at
    ON notification_deliveries (delivered_at);
For high-volume notifications -- weekly digests, marketing announcements -- consider a separate digest queue that batches per-user events and delivers one message per period rather than one per event. This is a separate system from the transactional notification pipeline and should be treated as such.
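The batching step of a digest queue is, at its core, a group-by per user. A minimal sketch, assuming each pending event is an array with hypothetical user_id and summary keys:

```php
<?php
declare(strict_types=1);

/**
 * Groups pending events so each user receives one digest per period.
 *
 * @param list<array{user_id: string, summary: string}> $pendingEvents
 * @return array<string, list<string>> summaries keyed by user ID
 */
function buildDigests(array $pendingEvents): array
{
    $digests = [];
    foreach ($pendingEvents as $event) {
        $digests[$event['user_id']][] = $event['summary'];
    }

    return $digests;
}

$digests = buildDigests([
    ['user_id' => 'user-1', 'summary' => 'Ana commented on "Q3 plan"'],
    ['user_id' => 'user-2', 'summary' => 'New member joined "Design"'],
    ['user_id' => 'user-1', 'summary' => 'Ben mentioned you in "Roadmap"'],
]);
```

Each entry in the result becomes a single message for the period, regardless of how many events accumulated.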
Rate limiting per user and per channel is worth adding early. A bug in event emission code that triggers ten thousand "new message" notifications for the same user is a real incident. A simple counter in Redis with a TTL catches this before the queue fills with garbage.
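A fixed-window counter of that kind can be sketched as below. The FixedWindowLimiter class and its key format are hypothetical, and the array stands in for Redis so the example runs on its own; with Redis, the same logic maps onto an increment plus an expiry on the key.

```php
<?php
declare(strict_types=1);

// In-memory stand-in for a Redis counter with a TTL (assumption for the sketch).
final class FixedWindowLimiter
{
    /** @var array<string, array{count: int, expiresAt: int}> */
    private array $counters = [];

    public function __construct(
        private readonly int $limit,
        private readonly int $windowSeconds,
    ) {}

    // $now is passed in (Unix seconds) so the behaviour is easy to test.
    public function allow(string $userId, string $channel, int $now): bool
    {
        $key = "notif:{$userId}:{$channel}";
        $entry = $this->counters[$key] ?? null;

        if ($entry === null || $entry['expiresAt'] <= $now) {
            // First hit, or the previous window expired: start a new one.
            $entry = ['count' => 0, 'expiresAt' => $now + $this->windowSeconds];
        }

        $entry['count']++;
        $this->counters[$key] = $entry;

        return $entry['count'] <= $this->limit;
    }
}

$limiter = new FixedWindowLimiter(limit: 3, windowSeconds: 60);
```

The dispatcher would call allow() before enqueueing and drop or log anything over the limit, which keeps a runaway emitter from flooding the queue.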
Where to Go From Here
If you are building a SaaS product and this notification architecture feels more complex than your current team can absorb at once, that is a signal worth taking seriously. Getting the foundations right -- clean separation between event emission and delivery, idempotent adapters, a sensible preference model -- saves weeks of debugging later when the edge cases surface in production.
Wolf-Tech helps SaaS teams design and build exactly this kind of infrastructure. Whether you need a full system designed from scratch or a review of what you have already built, reach out at hello@wolf-tech.io or visit wolf-tech.io. If you are evaluating your current codebase before adding new capabilities, our code quality consulting and custom software development services are a good starting point.
A notification system that respects user preferences and never double-sends is not a luxury feature -- it is the kind of infrastructure that separates products users trust from products they eventually unsubscribe from.

