GDPR Right-to-Erasure Engineering: Actually Deleting Users From Complex SaaS Systems
A B2B SaaS team I reviewed last quarter had a "GDPR Delete My Account" button in their product. Clicking it set users.deleted_at to the current timestamp and emailed support. That was it. Audit logs still stored full names and email addresses. Their nightly backup retained every deleted record for 90 days. The data warehouse continued to track the user's behavioural events, keyed by a user_id that never disappeared. When a customer eventually filed a formal Article 17 request and asked for evidence of deletion, the team needed three engineers and four days to produce something defensible — and the answer was still "mostly yes, with caveats."
This is the gap between GDPR right to erasure on paper and right-to-erasure in production. The legal text is short, the engineering surface is enormous, and most teams discover the second part only when a Data Protection Authority sends a polite letter or a security questionnaire from an enterprise prospect lands on their desk. This post walks through what a realistic deletion pipeline actually looks like across modern SaaS architectures: the inventory step nobody does, the backup problem that has no clean solution, the tombstone pattern you need for foreign keys, and the orchestration logic that keeps multi-system deletion honest under failure.
What "Erasure" Actually Means Under Article 17
GDPR Article 17 grants data subjects the right to obtain erasure of their personal data "without undue delay". In practice the deadline comes from Article 12(3): one month, extendable by two more for complex or numerous requests. Erasure means the data is no longer processable, not necessarily that every byte is overwritten on every disk in every jurisdiction. Regulator guidance, most explicitly the UK ICO's, accepts that putting data "beyond use" can satisfy the erasure obligation for backup tapes and similar systems, provided the data is not accessed in normal operations, cannot come back into live use, and is securely destroyed at the end of the existing retention cycle.
Two consequences matter for engineers. First, you are not obliged to break your backup architecture to honour an erasure request the day it arrives; you are obliged to ensure the deleted user does not reappear in production after a restore. Second, you cannot use that flexibility as a blanket excuse: anything reachable through normal application paths — primary database, search index, cache, analytics warehouse, audit log viewer, exported reports — must be properly erased on schedule, with documented evidence.
There are also exceptions worth knowing. Article 17(3) lets you retain data necessary for legal claims, compliance with a legal obligation such as financial record-keeping (typically ten years in Germany under HGB §257), public-interest archiving, or freedom of expression. A "right to be forgotten" request from a user who still owes you money does not erase the invoice trail. The right design separates data held under a legal retention basis from everything else, so that everything else can be deleted on time without legal anxiety.
Step One: Inventory Where Personal Data Actually Lives
Most deletion bugs start with a stale data inventory. A typical Symfony + React SaaS at €5–€20M ARR will have personal data in at least the following places, often in more:
- The primary relational database (users, accounts, team_members, invitations, audit_log, notifications, support_messages)
- A search index (Meilisearch, Typesense, Elasticsearch) that mirrors a denormalised view of users and their content
- One or more caches (Redis, Memcached, in-process LRU) holding user-keyed payloads with TTLs measured in hours to days
- Object storage (S3, MinIO, Backblaze B2) for uploads, exports, profile pictures, generated PDFs
- Email and transactional providers (SendGrid, Postmark, Mailgun) with webhook archives, suppression lists, and message logs
- A data warehouse (BigQuery, Snowflake, Postgres + dbt) replicating users, events, billing data
- Product analytics (Amplitude, Mixpanel, PostHog) keyed by user_id or device ID
- CRM and support tools (HubSpot, Salesforce, Intercom, Zendesk) with notes, ticket histories, recorded calls
- Backups, snapshots, and replicas of every database above
Before you write a deletion endpoint, run a personal-data mapping exercise and write the result down. A simple spreadsheet with columns for system, location, identifier, retention basis, and deletion mechanism is enough. The inventory is the source of truth for your erasure pipeline; the pipeline implements what the inventory describes. Skipping the inventory is the most common cause of "we forgot we had PII there" incidents, which typically surface during a code quality audit or a procurement security review.
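One way to keep the inventory from going stale is to hold it in a machine-readable form next to the code. The sketch below is a minimal illustration under assumptions: the InventoryEntry class, the helper function, and the sample entries are all hypothetical, and the fields simply mirror the spreadsheet columns named above.

```php
<?php
declare(strict_types=1);

// Hypothetical shape: one entry per place where personal data lives.
final class InventoryEntry
{
    public function __construct(
        public readonly string $system,
        public readonly string $location,
        public readonly string $identifier,
        public readonly string $retentionBasis,
        public readonly ?string $deletionMechanism, // null = nobody has decided yet
    ) {}
}

/**
 * Returns the systems that hold personal data but have no deletion mechanism.
 *
 * @param list<InventoryEntry> $inventory
 * @return list<string>
 */
function systemsWithoutDeletionMechanism(array $inventory): array
{
    $missing = [];
    foreach ($inventory as $entry) {
        if ($entry->deletionMechanism === null || trim($entry->deletionMechanism) === '') {
            $missing[] = $entry->system;
        }
    }

    return $missing;
}

// Illustrative entries only, not a real inventory.
$inventory = [
    new InventoryEntry('primary_db', 'users.email', 'user_id', 'contract', 'tombstone update'),
    new InventoryEntry('search', 'users index', 'user_id', 'contract', 'document delete'),
    new InventoryEntry('analytics', 'events', 'device_id', 'consent', null),
];

print_r(systemsWithoutDeletionMechanism($inventory));
```

Running a check like this in CI turns "we forgot we had PII there" into a failing build instead of an audit finding.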
The Backup Problem and Why You Should Not Solve It With Surgery
The single most asked question is: "Do I need to delete the user from every nightly database backup?" Surgical deletion from binary database backups is technically possible, operationally terrifying, and almost never the right answer. Restoring a backup, mutating it, re-encrypting it, and re-uploading it for every erasure request creates more risk than it removes — corrupted backups, broken restore tests, and a transactional integrity surface that is impossible to audit.
The accepted approach is deletion-by-replay. You store erasure requests in an append-only log inside production, separate from the primary tables. When a backup is restored — for disaster recovery or for any other reason — the very first step of the restore runbook is to replay the erasure log against the freshly-restored database, deleting every user marked for erasure between the backup's date and the restore date. The backup itself is not modified; the system that restores it is responsible for not re-introducing already-erased subjects.
Two implementation details make this defensible:
// src/Entity/ErasureRequest.php
use Doctrine\ORM\Mapping as ORM;

#[ORM\Entity]
#[ORM\Table(name: 'erasure_requests')]
class ErasureRequest
{
    #[ORM\Id]
    #[ORM\Column(type: 'uuid')]
    public string $id;

    #[ORM\Column(type: 'string', length: 64)]
    public string $subjectHash; // SHA-256 of canonical user identifier

    #[ORM\Column(type: 'datetimetz_immutable')]
    public \DateTimeImmutable $requestedAt;

    // Stays null until every step of the erasure has finished
    #[ORM\Column(type: 'datetimetz_immutable', nullable: true)]
    public ?\DateTimeImmutable $completedAt = null;

    #[ORM\Column(type: 'string', length: 32)]
    public string $status; // received | in_progress | completed | failed
}
First, the log stores a hash of the user identifier, not the email or user ID itself. That way the erasure log can outlive the deleted data without re-introducing PII. Second, your backup retention policy must explicitly cover the erasure log: a typical setup is 30 days of full backups, after which the data is gone for good and any related erasure entries can be expired safely. Document this in your privacy notice — DPAs accept it when it is written down and implemented consistently.
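The replay step from the restore runbook can be sketched roughly as follows. This is an illustration under assumptions, not the implementation: subjectHash() and usersToReErase() are hypothetical names, the log entries are plain arrays rather than Doctrine entities, and the filter uses completedAt so that requests still in flight when the backup was taken are replayed too.

```php
<?php
declare(strict_types=1);

// Canonicalise before hashing so the same user always yields the same hash.
function subjectHash(string $userId): string
{
    return hash('sha256', strtolower(trim($userId)));
}

/**
 * Decide which users in a freshly restored database must be erased again.
 *
 * @param list<array{subjectHash: string, completedAt: ?DateTimeImmutable}> $erasureLog
 * @param list<string> $restoredUserIds user IDs present in the restored database
 * @return list<string> user IDs to push through the erasure routine
 */
function usersToReErase(array $erasureLog, array $restoredUserIds, DateTimeImmutable $backupTakenAt): array
{
    $hashesToReplay = [];
    foreach ($erasureLog as $entry) {
        $completedAt = $entry['completedAt'];
        // Anything not finished before the backup was taken may still be in it.
        if ($completedAt === null || $completedAt >= $backupTakenAt) {
            $hashesToReplay[$entry['subjectHash']] = true;
        }
    }

    $matches = [];
    foreach ($restoredUserIds as $userId) {
        if (isset($hashesToReplay[subjectHash($userId)])) {
            $matches[] = $userId;
        }
    }

    return $matches;
}

// Illustrative data only.
$backupTakenAt = new DateTimeImmutable('2024-03-01T02:00:00+00:00');
$erasureLog = [
    ['subjectHash' => subjectHash('user-17'), 'completedAt' => new DateTimeImmutable('2024-02-20T10:00:00+00:00')],
    ['subjectHash' => subjectHash('user-42'), 'completedAt' => new DateTimeImmutable('2024-03-05T10:00:00+00:00')],
    ['subjectHash' => subjectHash('user-99'), 'completedAt' => null],
];

$toReErase = usersToReErase($erasureLog, ['user-17', 'user-42', 'user-7', 'user-99'], $backupTakenAt);
print_r($toReErase);
```

If identifiers are guessable (sequential IDs, email addresses), a plain SHA-256 can be brute-forced. Swapping hash() for PHP's hash_hmac('sha256', ...) with a secret kept outside the database is one way to harden the log, at the cost of having to protect that key.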
Tombstones, Foreign Keys, and the "We Cannot Just DELETE" Reality
In a real schema, hard-deleting a user breaks invariants everywhere. Their invoices, audit-log entries, support tickets, comments and references are still legally meaningful — the transactional fact needs to remain even when the personal data is removed. The pattern that works in production is the tombstone: the user row is anonymised in place, foreign keys remain valid, and dependent tables have their PII columns nulled or hashed.
BEGIN;

-- Anonymise the primary user record but keep the row
UPDATE users
SET email = CONCAT('erased+', id, '@example.invalid'),
    full_name = 'Erased User',
    avatar_url = NULL,
    locale = NULL,
    last_login_ip = NULL,
    erased_at = NOW(),
    status = 'erased'
WHERE id = $1;

-- Strip PII from dependent tables, keep transactional shape
UPDATE audit_log
SET actor_email = NULL, ip_address = NULL, user_agent = NULL
WHERE actor_id = $1;

UPDATE support_messages
SET body = '[content erased]', attachment_url = NULL
WHERE author_id = $1;

-- Hard-delete records that have no legitimate retention basis
DELETE FROM api_tokens WHERE user_id = $1;
DELETE FROM webhook_subscriptions WHERE user_id = $1;
DELETE FROM notification_inbox WHERE user_id = $1;

COMMIT;
A few rules separate a robust tombstone implementation from a leaky one. Use a fake top-level domain like .invalid (RFC 6761) so nobody accidentally emails an erased user. Ensure every PII column on every table is either nulled, redacted, or hashed — not just the obvious ones; user-agent strings, browser fingerprints and IP addresses are personal data under recital 30. Add a CI test that asserts new tables containing user-linked columns also have a corresponding clause in the erasure routine. We add this kind of guard rail routinely during legacy code optimization projects, because it is the cheapest insurance against regressions when the schema grows.
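The CI guard rail could look something like the sketch below. It assumes two hand-maintained maps, one listing the columns flagged as personal data per table and one listing the columns the erasure routine handles; both maps, the login_history table, and the function name are hypothetical. A real project might derive the first map from schema metadata instead.

```php
<?php
declare(strict_types=1);

/**
 * @param array<string, list<string>> $piiColumns    table => columns flagged as personal data
 * @param array<string, list<string>> $erasureCovers table => columns the erasure routine handles
 * @return list<string> "table.column" entries with no erasure clause
 */
function uncoveredPiiColumns(array $piiColumns, array $erasureCovers): array
{
    $uncovered = [];
    foreach ($piiColumns as $table => $columns) {
        $covered = $erasureCovers[$table] ?? [];
        foreach (array_diff($columns, $covered) as $column) {
            $uncovered[] = $table . '.' . $column;
        }
    }
    sort($uncovered);

    return $uncovered;
}

// Illustrative maps: login_history was added to the schema but not to the erasure routine.
$piiColumns = [
    'users' => ['email', 'full_name', 'avatar_url', 'locale', 'last_login_ip'],
    'audit_log' => ['actor_email', 'ip_address', 'user_agent'],
    'login_history' => ['ip_address', 'user_agent'],
];
$erasureCovers = [
    'users' => ['email', 'full_name', 'avatar_url', 'locale', 'last_login_ip'],
    'audit_log' => ['actor_email', 'ip_address', 'user_agent'],
];

$uncovered = uncoveredPiiColumns($piiColumns, $erasureCovers);
print_r($uncovered);
```

In a test suite, the assertion is simply that the returned list is empty; a non-empty list names exactly which columns need a clause.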
Multi-System Orchestration: Saga Patterns for Erasure
Erasure across the warehouse, the search index, the email provider and your own database is not a single transaction. It is a distributed workflow that has to handle partial failure honestly: HubSpot's API can rate-limit, Snowflake is asynchronous, Meilisearch reindexes lazily. Treat erasure like any other long-running business process — with a saga, a state machine, and idempotent steps.
// src/Service/Erasure/ErasurePipeline.php
final class ErasurePipeline
{
    public function __construct(
        private readonly UserAnonymizer $primaryDb,
        private readonly SearchIndexCleaner $search,
        private readonly WarehouseRedactor $warehouse,
        private readonly EmailProviderCleaner $email,
        private readonly StorageCleaner $storage,
        private readonly ErasureRequestRepository $requests,
    ) {}

    public function run(string $requestId): void
    {
        $request = $this->requests->lockForUpdate($requestId);

        // Ordered steps; each one must be safe to run more than once.
        $steps = [
            'primary_db' => fn() => $this->primaryDb->anonymise($request),
            'search'     => fn() => $this->search->purge($request),
            'storage'    => fn() => $this->storage->deleteObjects($request),
            'email'      => fn() => $this->email->suppressAndPurge($request),
            'warehouse'  => fn() => $this->warehouse->scheduleRedaction($request),
        ];

        foreach ($steps as $name => $fn) {
            // Skip work already recorded by an earlier, interrupted run.
            if ($request->isStepDone($name)) {
                continue;
            }

            try {
                $fn();
                $request->markStepDone($name);
                $this->requests->save($request);
            } catch (TransientException $e) {
                // Stop here; a later run resumes at this step.
                $this->requests->scheduleRetry($request, $name, $e);
                return;
            }
        }

        $request->status = 'completed';
        $request->completedAt = new \DateTimeImmutable();
        $this->requests->save($request);
    }
}
Three properties matter more than the specific framework. Each step must be idempotent: running it twice on a partially completed request should not create new records or fail loudly. Each step must persist its progress before the next one starts, so a worker crash leaves the system in a recoverable state. And the orchestrator must distinguish transient failures (rate limits, network blips) from permanent outcomes. When a permanent outcome means the data is already absent (the user does not exist in the warehouse, the S3 object is already gone), mark the step done. Transient failures should retry with exponential backoff. Any other permanent error should set the request to failed and alert someone, rather than retrying forever or passing silently.
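The failure handling can be sketched as two small functions. Everything here is an assumption made for illustration: the StepOutcome enum, the mapping from HTTP status codes to outcomes, and the 30-second base and one-hour cap are hypothetical choices, and the Failed outcome for other permanent errors goes slightly beyond the simple done-or-retry split. Real providers signal "already gone" and "rate limited" in their own ways.

```php
<?php
declare(strict_types=1);

enum StepOutcome: string
{
    case Done = 'done';
    case Retry = 'retry';
    case Failed = 'failed';
}

// Map a provider response to an outcome; null means no response was received.
function classifyResponse(?int $httpStatus): StepOutcome
{
    return match (true) {
        $httpStatus === null => StepOutcome::Retry,                       // network blip
        $httpStatus >= 200 && $httpStatus < 300 => StepOutcome::Done,     // erased now
        $httpStatus === 404 || $httpStatus === 410 => StepOutcome::Done,  // already gone
        $httpStatus === 429 || $httpStatus >= 500 => StepOutcome::Retry,  // rate limit or outage
        default => StepOutcome::Failed,                                   // needs a human
    };
}

// Exponential backoff: base, 2x base, 4x base, ... up to the cap.
function retryDelaySeconds(int $attempt, int $baseSeconds = 30, int $capSeconds = 3600): int
{
    $exponent = min(max($attempt, 0), 20); // clamp to keep the multiplication small

    return min($capSeconds, $baseSeconds * (2 ** $exponent));
}

foreach ([0, 1, 2, 3, 10] as $attempt) {
    echo $attempt, ' => ', retryDelaySeconds($attempt), "s\n";
}
```

Treating "already gone" as done is what makes a step idempotent from the orchestrator's point of view: the second run of a completed deletion looks like success, not an error.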
For the warehouse specifically, real-time deletion is rarely possible. The pragmatic approach is to maintain a daily DAG that joins your fact tables against the erasure log and either deletes matching rows or replaces PII columns with hashed values. Most data teams already run dbt or Airflow for transformations; adding an erasure_apply step at the end of the daily run keeps the warehouse consistent within the GDPR's "without undue delay" window.
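As a rough illustration, an erasure_apply step might generate one statement per warehouse table. The sketch assumes that each table carries a subject_hash column and that the erasure log is replicated into the warehouse as erasure_requests; neither is stated above, and the function name is hypothetical. For brevity it nulls PII columns rather than hashing them, and the generated SQL is generic and may need adapting to the warehouse dialect.

```php
<?php
declare(strict_types=1);

// Reject anything that is not a plain identifier before interpolating it into SQL.
function assertSafeIdentifier(string $name): void
{
    if (preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', $name) !== 1) {
        throw new InvalidArgumentException("Unsafe identifier: $name");
    }
}

/**
 * @param array<string, list<string>> $tables table => PII columns to null; empty list = delete rows
 * @return list<string> one SQL statement per table
 */
function erasureApplySql(array $tables, string $erasureLogTable = 'erasure_requests'): array
{
    assertSafeIdentifier($erasureLogTable);
    $match = "WHERE subject_hash IN (SELECT subject_hash FROM $erasureLogTable)";

    $statements = [];
    foreach ($tables as $table => $columns) {
        $table = (string) $table;
        assertSafeIdentifier($table);
        array_map('assertSafeIdentifier', $columns);

        if ($columns === []) {
            $statements[] = "DELETE FROM $table $match";
            continue;
        }

        $assignments = implode(', ', array_map(static fn (string $c): string => "$c = NULL", $columns));
        $statements[] = "UPDATE $table SET $assignments $match";
    }

    return $statements;
}

// Illustrative config: drop raw events, keep the user dimension row but strip PII.
$warehouseSql = erasureApplySql([
    'events' => [],
    'dim_users' => ['email', 'full_name'],
]);

echo implode(";\n", $warehouseSql), ";\n";
```

Because the statements are keyed off the erasure log rather than a list of pending requests, running the step every day is harmless: rows that were already removed or nulled simply no longer match or do not change.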
What to Hand the User and the DPO
Closing the loop is part of the obligation. The data subject is entitled to confirmation that their data has been erased, and your DPO is entitled to evidence the system worked. The right deliverable is not a screenshot — it is a generated certificate that summarises which systems were processed, when each step completed, what categories of data remained for legal retention reasons, and the retention end date for those categories. Generating this document directly from the erasure log makes the privacy team's life dramatically easier and gives auditors something concrete to point at.
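A certificate generated from the erasure log might be assembled like this. The field names, the function, and the sample retention entry are hypothetical; the point of the sketch is only that the document is derived from recorded step completions rather than written by hand.

```php
<?php
declare(strict_types=1);

/**
 * @param array<string, ?DateTimeImmutable> $stepCompletedAt step name => completion time (null = pending)
 * @param list<array{category: string, basis: string, retainedUntil: string}> $retained
 * @return array<string, mixed>
 */
function buildErasureCertificate(string $requestId, array $stepCompletedAt, array $retained): array
{
    $systems = [];
    foreach ($stepCompletedAt as $step => $completedAt) {
        // Pending steps stay null so the certificate never overstates what happened.
        $systems[$step] = $completedAt?->format(DATE_ATOM);
    }

    return [
        'requestId' => $requestId,
        'systems' => $systems,
        'allSystemsProcessed' => !in_array(null, $systems, true),
        'retainedForLegalReasons' => $retained,
    ];
}

// Illustrative values only.
$certificate = buildErasureCertificate(
    'req-0001',
    [
        'primary_db' => new DateTimeImmutable('2024-03-05T10:00:00+00:00'),
        'search' => new DateTimeImmutable('2024-03-05T10:00:04+00:00'),
        'warehouse' => null,
    ],
    [
        ['category' => 'invoices', 'basis' => 'statutory bookkeeping retention', 'retainedUntil' => '2034-12-31'],
    ],
);

echo json_encode($certificate, JSON_PRETTY_PRINT | JSON_THROW_ON_ERROR), "\n";
```

Issuing the certificate only once allSystemsProcessed is true keeps the confirmation sent to the data subject consistent with what the log can actually prove.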
For B2B SaaS specifically, this becomes a sales asset. Enterprise procurement teams now routinely ask for an evidence package on Article 17 implementation; teams that can hand over a one-page architecture document and a sample certificate are weeks ahead of competitors who answer "yes, we comply." We have helped several Berlin and EU SaaS clients build this exact pipeline as part of broader digital transformation and code quality consulting engagements.
A Realistic Checklist
If you want a starting point, run your current implementation against this list and write down where it fails. The gaps tell you what to build next.
- The data inventory exists, is current, and is reviewed at every schema change.
- The erasure log is append-only, stores hashed identifiers, and is replayed on every backup restore.
- The primary database deletion is a tombstone operation covering every PII column on every table touched by the user.
- The search index, cache, object storage and email provider are all handled by named, idempotent steps in a saga.
- The data warehouse has a daily redaction DAG keyed off the erasure log.
- Failures are observable, retried with backoff, and produce alerts after a configured threshold.
- A subject can request and receive an erasure certificate.
- Retained data has a documented legal basis and a deletion date.
GDPR right to erasure is solvable, but it is solvable only when you treat it as a real engineering surface — with a schema, a workflow, tests, and observability — rather than a compliance checkbox bolted onto a CRUD endpoint. Ship the pipeline once, and the next time a procurement questionnaire or DPA letter arrives, your answer is a link to the architecture document.
If you are staring at a half-implemented deletion flow and a quarterly DPO review on the calendar, we can help. Contact us at hello@wolf-tech.io or visit wolf-tech.io for a free consultation on GDPR-grade data deletion engineering for European SaaS.

