The Problem: PII is Everywhere
When a user requests erasure under GDPR Article 17, you have 30 days to delete all their personal data. In theory, this is simple. In practice, a typical SaaS application stores user data across 5-15 tables:
users(email, name, phone)profiles(address, date of birth, avatar)orders(shipping address, billing name)sessions(IP address, user agent)payments(IBAN, card last four)audit_logs(who did what, when)support_tickets(message content, attachments)
Miss one table, and you're non-compliant. Worse: you probably can't prove you deleted everything, because you have no audit trail of the deletion itself.
Why Manual Deletion Fails
The naive approach is a series of DELETE statements:
DELETE FROM sessions WHERE user_id = 'user-4821';
DELETE FROM payments WHERE user_id = 'user-4821';
DELETE FROM orders WHERE user_id = 'user-4821';
DELETE FROM profiles WHERE user_id = 'user-4821';
DELETE FROM users WHERE user_id = 'user-4821';
This has three problems:
- It's incomplete. When a developer adds a new table with PII, the deletion script doesn't get updated. The Confluence doc is 14 months old.
- It's not atomic. If the third DELETE fails, you've partially deleted a user — some tables clean, some not.
- There's no proof. An auditor asks "show me evidence of deletion" and you have nothing.
The Solution: A PII Registry
The core insight is that you need a single source of truth for where PII lives in your database. Not a spreadsheet. Not a Confluence doc. A registry that lives in the database itself.
Step 1: Install pgcomply
\i pgcomply.sql
This creates the pgcomply schema with all functions and tables. Takes about 2 seconds.
Step 2: Register Every PII Column
-- Users table
SELECT pgcomply.register_pii('users', 'email', 'email', 'user_id');
SELECT pgcomply.register_pii('users', 'name', 'person_name', 'user_id');
SELECT pgcomply.register_pii('users', 'phone', 'phone', 'user_id');
-- Profiles table
SELECT pgcomply.register_pii('profiles', 'street', 'address', 'user_id', 'anonymize');
SELECT pgcomply.register_pii('profiles', 'date_of_birth', 'date_of_birth', 'user_id', 'null');
-- Sessions table (delete entire row)
SELECT pgcomply.register_pii('sessions', 'ip_addr', 'ip_address', 'user_id', 'delete');
Or use bulk registration:
SELECT pgcomply.register_pii_bulk(ARRAY[
ROW('users', 'email', 'email', 'user_id', 'delete')::pgcomply.pii_registration,
ROW('users', 'name', 'person_name', 'user_id', 'delete')::pgcomply.pii_registration,
ROW('profiles', 'street', 'address', 'user_id', 'anonymize')::pgcomply.pii_registration
]);
Forget strategies determine what happens during erasure:
| Strategy | Behavior | Use When |
|----------|----------|----------|
| delete | Deletes the entire row | Session logs, temp data |
| null | Sets column to NULL | Need row for analytics, remove PII |
| anonymize | Replaces with fake data | Need realistic data for reporting |
Step 3: Execute Erasure
SELECT pgcomply.forget('user-4821');
One function call. It:
- Finds every table registered for this user ID
- Executes the configured strategy (delete/null/anonymize)
- Logs the operation in the immutable audit trail
- Returns a summary of what was deleted
Output:
NOTICE: pgcomply.forget: users — 1 row(s) deleted
NOTICE: pgcomply.forget: profiles — 1 row(s) anonymized
NOTICE: pgcomply.forget: sessions — 47 row(s) deleted
NOTICE: pgcomply.forget: payments — 3 row(s) deleted
Step 4: Verify Completeness
SELECT * FROM pgcomply.verify_forget('user-4821');
This scans every registered PII column and checks for residual data:
table_name | column_name | pii_type | status
------------+-------------+--------------+--------
users | email | email | clean
users | name | person_name | clean
profiles | street | address | clean
sessions | ip_addr | ip_address | clean
payments | iban | financial | clean
If anything remains, the status shows residual_found — meaning the forget operation missed something.
The Audit Trail
Every operation is logged immutably:
SELECT * FROM pgcomply.audit_log
WHERE details->>'subject_id' = 'user-4821'
ORDER BY created_at DESC;
The audit log uses SHA-256 chaining — each entry's hash depends on the previous entry's hash. Any tampering (modification, deletion, insertion) breaks the chain, detectable via:
SELECT pgcomply.verify_audit();
This is the evidence your auditor needs.
Handling Edge Cases
What about data in foreign tables?
Register them like any other table. pgcomply works with any table in the same PostgreSQL instance.
What about data in external services?
pgcomply handles the database layer. For external services (email providers, analytics platforms), use webhooks to trigger cleanup:
-- Pro: Set up a webhook to notify external services
SELECT pgcomply.add_webhook(
'https://api.your-app.com/gdpr/cleanup',
ARRAY['forget_complete']
);
What about retention requirements?
Some data has legal retention periods (e.g., invoices must be kept for 7 years in Germany). Use the null strategy to remove PII while keeping the record:
-- Keep the order, remove the person
SELECT pgcomply.register_pii('orders', 'billing_name', 'person_name', 'user_id', 'null');
Preventing Future Drift
New tables get added. New columns appear. PII creeps in without anyone updating the registry. pgcomply catches this automatically:
SELECT * FROM pgcomply.schema_drift();
This scans for unregistered columns matching PII patterns (email, phone, name, address, IP, IBAN, etc.) and flags them before your auditor does.
Summary
Implementing GDPR Right to Erasure in PostgreSQL requires three things: knowing where PII lives, deleting it reliably, and proving you did it. pgcomply provides all three with pure SQL — no sidecar, no external service, no agent. Install in 2 seconds, register your PII, and forget() handles the rest.