GDPR Right-to-Erasure Engineering: Actually Deleting Users From Complex SaaS Systems
In almost every GDPR audit we run on a mid-size European SaaS, the same scene plays out. A senior engineer opens the admin console, clicks "Delete user", a green toast appears, and everyone agrees the company is compliant with Article 17. Then we ask three questions. Are the user's rows still in the nightly Postgres dump on S3? Yes. Did the deletion sync to the warehouse where marketing queries cohorts? No. Did the audit log retain their email address as the actor on every action they ever took? Yes. The polite term is "non-conforming". The accurate term is that the company has been quietly violating the GDPR right to erasure for years and the next supervisory authority complaint will surface it.
Article 17 is not a UI feature — it is a distributed systems problem with legal deadlines. The moment a product grows beyond a single Postgres database, honouring it becomes one of the harder engineering jobs a SaaS team will face. This post walks through what an honest right-to-erasure implementation looks like in a real, multi-service architecture: how to handle backups without breaking disaster recovery, what to do with audit logs and analytics, when tombstone records are appropriate, and how to orchestrate deletion across systems you do not directly control.
What Article 17 actually requires
The right to erasure obliges you to delete personal data without undue delay when a data subject withdraws consent, the processing was unlawful, or the data is no longer necessary for the purpose for which it was collected. "Without undue delay" is interpreted by most European supervisory authorities as within one month, with a possible extension of two further months for complex cases, and you must inform the data subject of any extension, and the reasons for it, within that first month.
Three points trip up engineering teams. First, the right is not absolute: legal retention obligations (tax law, anti-money-laundering, employment records) override it for the data those laws cover, but only for that data and only for the retention period the law specifies. Second, "deletion" does not mean a status = 'deleted' flag on the row — that is a soft delete, and soft-deleted personal data is still personal data being processed. Third, the obligation extends to processors: anywhere you have shipped this user's data — Stripe, HubSpot, Mixpanel, your hosted Elasticsearch, an external data warehouse — the controller is responsible for ensuring deletion propagates.
This is what makes GDPR data deletion a genuine engineering problem rather than a paperwork one. A modest B2B SaaS typically replicates user data into eight to fifteen distinct systems. An "honest" right-to-erasure flow has to find them all.
Mapping the data: the work nobody wants to do
Every serious right-to-erasure implementation starts with a data flow inventory, and there is no shortcut. You need to walk every place a user identifier lands and decide what happens to it on deletion. We typically organise this as a table with one row per system and four columns: data category, lawful basis, retention rule, deletion mechanism.
A representative inventory for a Symfony + Next.js SaaS looks something like this. Primary Postgres holds user, account, billing and content tables, with read replicas mirroring it. A full-text search index (Meilisearch, Elasticsearch) carries a denormalised copy of names, emails and content. The audit log lives in a separate, append-only store. The data warehouse (BigQuery, Snowflake) receives nightly ETL. Customer support holds conversation history in Zendesk or Intercom; marketing has HubSpot; product analytics has Mixpanel or PostHog; error tracking is Sentry, where breadcrumbs routinely contain user IDs and emails. Email infrastructure (Postmark, SendGrid) has its own logs. Object storage has uploaded files, often with the user ID in the path. And then there are backups: Postgres base backups, WAL archives, S3 object versioning, warehouse snapshots.
Every one of those needs a defined deletion path. The deliverable from this exercise is a "data subject request runbook" — an internal document that, for every system, says what is deleted, what is retained under what legal basis, and how long it takes. This runbook is also what your DPO will produce in response to a supervisory authority complaint, so write it like you mean it.
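To make the runbook concrete, here is one entry sketched as a small data structure. This is purely illustrative: the field names and the example values are our own, not a prescribed schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RunbookEntry:
    """One row of the data subject request runbook (illustrative fields)."""
    system: str              # where the data lives
    data_category: str       # what kind of personal data it holds
    lawful_basis: str        # why it is processed or retained
    retention_rule: str      # how long it may be kept
    deletion_mechanism: str  # how erasure is actually executed

entry = RunbookEntry(
    system="audit log",
    data_category="actor UUID, event metadata",
    lawful_basis="legitimate interest, Art. 6(1)(f)",
    retention_rule="12 months, then purged",
    deletion_mechanism="pseudonym mapping row hard-deleted on erasure",
)
```

One such entry per system, reviewed by both engineering and the DPO, is the whole deliverable.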
The backup problem, honestly
The single most common failure mode is pretending backups are out of scope. They are not. The CNIL, the Irish DPC and the German state DPAs have all published guidance making clear that backups are part of the processing, even though they receive a more pragmatic treatment than live systems.
The pragmatic treatment looks like this. You do not need to restore every backup, run a deletion against it, and re-archive it — that would destroy your disaster recovery posture. What you do need is a documented policy that says: backups are encrypted; the retention period is bounded (typically 30 to 90 days); on deletion of a user from production, that user's identifier is added to a "deletion register"; if a backup is ever restored, the register is replayed before the restored data is exposed to the application; the backup then naturally expires the personal data after the retention window.
This is defensible in front of a regulator if and only if three things are true: the retention window is short and documented, the deletion register exists and is exercised in restore drills, and no backup older than the retention window is kept "for safety". That last decision is what turns a defensible policy into a non-conforming one. The same logic applies to S3 object versioning, warehouse time-travel features and database PITR — bound the window, document it, replay deletions on restore.
Audit logs, tombstones and the legitimate-interest line
The hardest deletion question in most SaaS products is the audit log. The product team needs the log for security investigations and customer trust. The legal team needs it for forensics. The compliance team needs it for the very GDPR they are also asking you to delete it for. And the audit log is, by design, append-only.
The defensible position is that audit logs of authentication, authorisation and security-relevant events can be retained on a legitimate-interest basis under Article 6(1)(f), but only if three things hold: the retention period is bounded and proportionate (twelve months is the figure most German DPAs accept without argument; longer requires a case), the actor identifier in retained log lines is replaced with a pseudonym on user deletion, and the legitimate interest is documented in a transparency notice the user can read.
In engineering terms that means audit log records should reference users by an internal pseudonymous ID — a UUID with no email or name embedded — and a separate users table maps those UUIDs to identities. On a right-to-erasure event, the users mapping row is hard-deleted, the audit log retains the UUID, and the records become genuinely de-identified. This pattern is sometimes called a tombstone record: the row that remains carries no personal data and cannot be re-associated with a person.
Tombstones are also the right answer for foreign-key integrity. A comments table that references user_id cannot lose its foreign key when the user deletes their account without breaking thread structure. The clean solution is to redirect the FK to a "deleted user" tombstone row that contains no personal data, then hard-delete every column on the original row that is or could be PII — not just email but display_name, avatar_url, bio, last_seen_ip. We see teams routinely overlook IP addresses and last-seen timestamps; both qualify as personal data under EU case law.
The analytics and warehouse propagation problem
The second-hardest problem is the data warehouse. ETL pipelines typically replicate the production database into the warehouse once a day. Most teams never build the reverse path: backwards delete propagation. As a result, BigQuery contains a complete history of every user the company ever had, long after they were deleted from production.
There are two clean patterns. The first is identifier-only warehousing: the production database streams events to the warehouse with a pseudonymous ID instead of email or name, and a separate small table of identity mappings is truncated and rebuilt nightly from production. When a user is deleted in production, the next nightly rebuild simply does not include them, and warehouse queries that join on the mapping return null for their identity from that point on.
The second pattern is an explicit deletion topic: production publishes a user.deleted event to Kafka, Pub/Sub or whatever stream the warehouse consumes. The warehouse pipeline subscribes to it and runs a deterministic delete on each downstream table within an SLA — typically 24 hours. We strongly prefer the second pattern for any warehouse that retains data for more than a few months, because it makes the deletion auditable: you can show a regulator the message bus topic, the consumer SLO, and the test suite that proves the consumer works.
Either pattern is fine. Doing neither is not.
Third-party processors: the long tail
Every SaaS has a long tail of third-party processors holding user data: payment providers, email senders, support tools, analytics, error tracking. The GDPR makes the controller responsible for ensuring deletion across this surface. Engineering work runs in two directions. First, every processor must support a deletion API (Stripe's customer deletion endpoint, HubSpot's GDPR delete, Mixpanel's data deletion API, PostHog's person deletion, Sentry's user deletion); vendors that require a CSV upload or a support ticket should be on a list to replace. Second, deletion has to be orchestrated, not manual. A flow that calls one API, fails on the second, and silently leaves five systems with the user's data is worse than no flow at all, because it gives the company false confidence. The correct architecture is an orchestration job (a Symfony Messenger consumer, a Temporal workflow, a Step Functions state machine) that fans out deletion calls, retries on transient failure, surfaces permanent failures to a human, and records the outcome per processor in a gdpr_deletion_audit table. That table is what the company can produce when asked.
A simplified Symfony Messenger handler, just to make the shape concrete:
final class DeleteUserHandler
{
    public function __construct(
        /** @var array<string, DeleterInterface> keyed by processor name */
        private readonly array $deleters,
        private readonly DeletionAuditRepository $audit,
        private readonly MessageBusInterface $bus,
    ) {}

    public function __invoke(DeleteUserCommand $cmd): void
    {
        foreach ($this->deleters as $processor => $deleter) {
            try {
                // Each deleter wraps one third-party API (Stripe, HubSpot, ...).
                $deleter->delete($cmd->userId);
                $this->audit->recordSuccess($cmd->userId, $processor);
            } catch (TransientException $e) {
                // Requeue for retry; record the attempt so partial runs are visible.
                $this->bus->dispatch(new RetryDeletion($cmd->userId, $processor));
                $this->audit->recordRetry($cmd->userId, $processor, $e->getMessage());
            } catch (\Throwable $e) {
                // Permanent failure: record it and surface it to a human.
                $this->audit->recordFailure($cmd->userId, $processor, $e->getMessage());
            }
        }
    }
}
The shape matters more than the language: each processor is a discrete step, each step writes to the audit table, transient failures are retried, permanent failures are surfaced, and the whole thing is idempotent so a partial run can be safely re-driven.
A pragmatic right-to-erasure checklist
Most teams that have already shipped a "delete account" button can tighten their compliance posture inside a quarter without a from-scratch rewrite. The work clusters into six items.
The first is the data flow inventory described above — a row per system, lawful basis written down, retention period decided. The second is the database hardening: foreign keys redirected to a tombstone row, every PII column on user-owned rows actually hard-deleted, IP and last-seen fields explicitly addressed. The third is audit log pseudonymisation: actor IDs become opaque UUIDs, the identity mapping is the deletable surface. The fourth is warehouse propagation through one of the two clean patterns above. The fifth is the orchestration job for third-party processors, with audit records per processor. The sixth is the backup policy: a documented, bounded retention window with a deletion register replayed on restore.
Underneath all of this is a discipline question: every new system that joins the architecture needs a deletion path defined the day it ships, not retrofitted three years later. We see this most often when teams add a new analytics tool, a new search index, or a new AI feature that quietly stores user prompts in a vector database — none of those typically arrive with a deletion strategy.
When the engineering bill is bigger than it looks
Retrofitting right-to-erasure into a system that was not built for it is typically six to twelve weeks of focused engineering, plus another four to six weeks of legal and DPO involvement for the policies and the privacy notice. Teams that try to do it in two weeks ship the half-implemented flow this post is about, and then carry the regulatory risk indefinitely. The bill grows when the architecture pre-dates the GDPR — older PHP applications often store IPs in seven different tables, ship denormalised emails into reporting views, and predate the concept of pseudonymous identity. That is the kind of work a legacy code modernization engagement is designed to absorb, alongside the database hygiene a code quality audit typically surfaces first.
If you are a CTO or DPO at a European SaaS and you are not sure whether your right-to-erasure flow would survive a supervisory authority audit, that is a useful conversation to have before the complaint arrives. Contact us at hello@wolf-tech.io or visit wolf-tech.io — eighteen years of European software work, including substantial GDPR remediation under regulatory pressure, sits behind every recommendation we make.