Why Lineage Matters
An auditor asks: "Where does user email data flow in your system?" Without lineage, you search through code, documentation, and developer memory. With lineage:
SELECT * FROM pgcomply.pii_lineage('email');
table_name | column | pii_type | source | purpose | legal_basis | retention | masking | rls
-------------+-------------+----------+----------------+---------------+-------------------+-----------+---------+-----
users | email | email | signup_form | account | contract | — | partial | yes
orders | receipt_email| email | users.email | billing | contract | 7 years | no | no
newsletter | email | email | consent_form | newsletter | consent | until_withdrawn | no | no
sessions | — | — | — | — | — | 30 days | — | —
This is your Article 30 record for email data.
Building Lineage
Step 1: Complete PII Registry
-- Ensure all PII columns are registered
SELECT pgcomply.classify();
SELECT * FROM pgcomply.schema_drift();
-- Address any gaps
Step 2: Document Purposes
-- Link PII to processing purposes
SELECT pgcomply.define_purpose('account', 'Account management', 'contract');
SELECT pgcomply.define_purpose('billing', 'Payment processing', 'contract');
SELECT pgcomply.define_purpose('newsletter', 'Email newsletter', 'consent');
Step 3: Generate Lineage
-- Full lineage for all PII types
SELECT * FROM pgcomply.pii_lineage();
-- Lineage for a specific PII type
SELECT * FROM pgcomply.pii_lineage('financial');
Step 4: Find Gaps
-- PII columns without a documented purpose
SELECT table_name, column_name, pii_type
FROM pgcomply.pii_registry r
LEFT JOIN pgcomply.consent_purposes p ON true
WHERE NOT EXISTS (
SELECT 1 FROM pgcomply.pii_lineage()
WHERE table_name = r.table_name AND purpose IS NOT NULL
);
Automating Lineage Updates
PII lineage must stay current. Automate detection of lineage gaps:
-- Weekly: check for PII without documented purpose
-- Add to your pg_cron schedule:
SELECT cron.schedule('lineage-check', '0 7 * * 1', $$
SELECT table_name, column_name, pii_type
FROM pgcomply.pii_registry r
WHERE NOT EXISTS (
SELECT 1 FROM pgcomply.pii_lineage()
WHERE table_name = r.table_name
AND purpose IS NOT NULL
);
$$);
Lineage for International Data Transfers
GDPR Article 44-49 governs international data transfers. If your PII flows to non-EU systems, lineage must document this:
-- Tag tables that replicate to non-EU systems
SELECT pgcomply.tag('analytics_exports', 'data_transfer', 'US - Mixpanel');
SELECT pgcomply.tag('email_queue', 'data_transfer', 'US - SendGrid');
-- Lineage view shows transfer destinations alongside PII types
SELECT l.*, t.value AS transfer_destination
FROM pgcomply.pii_lineage() l
LEFT JOIN pgcomply.get_tags(l.table_name) t ON t.key = 'data_transfer';
Any table with a data_transfer tag and PII requires either Standard Contractual Clauses (SCCs) or an adequacy decision.
Visualizing Data Flows
For non-technical stakeholders (DPO, legal, auditors), export lineage as a structured report:
-- Pro: Generate Article 30 records of processing
SELECT pgcomply.compliance_report('article_30');
For the Community Edition, build a simple view:
-- Summary: how many PII columns per purpose
SELECT purpose, legal_basis, count(*) AS pii_columns,
array_agg(DISTINCT pii_type) AS data_types
FROM pgcomply.pii_lineage()
WHERE purpose IS NOT NULL
GROUP BY purpose, legal_basis
ORDER BY pii_columns DESC;
purpose | legal_basis | pii_columns | data_types
------------+-------------+-------------+-----------------------------------
account | contract | 8 | {email,phone,person_name,address}
billing | contract | 4 | {email,financial,address}
newsletter | consent | 2 | {email,person_name}
analytics | legitimate | 1 | {ip_address}
This table is the foundation of your Article 30 record — it answers "what PII do we process, for what purpose, under which legal basis."
Summary
PII lineage turns your GDPR Article 30 obligation from a documentation exercise into a queryable database asset. pgcomply.pii_lineage() generates the map automatically from your PII registry, consent records, and audit trail. Run it before every audit to ensure your records of processing are complete and accurate.