The Drift Problem
Your PII registry is perfect today. But schemas change constantly: new features add columns, migrations restructure tables, developers create temporary tables that become permanent. Without drift detection, your compliance posture degrades with every deployment.
Running Drift Detection
SELECT * FROM pgcomply.schema_drift();
Output:
drift_type | table_name | column_name | suspected_pii | confidence | recommendation
-------------+------------------+----------------+---------------+------------+----------------------------
new_column | users | mobile_phone | phone | high | Register with register_pii()
new_table | newsletter_subs | email_address | email | high | Classify and register PII
stale_reg | temp_imports | ssn | government_id | high | Table dropped, remove registration
Three types of drift:
- new_column: A column matching PII patterns exists but is not registered
- new_table: A new table has columns matching PII patterns
- stale_reg: A registered PII column no longer exists (table or column was dropped)
Pattern Matching
pgcomply checks column names against patterns for:
| PII Type | Patterns Matched | |----------|-----------------| | email | email, e_mail, email_address, mail | | phone | phone, mobile, tel, telephone, fax | | name | name, first_name, last_name, full_name | | address | address, street, city, zip, postal | | ip_address | ip, ip_addr, ip_address, client_ip | | government_id | ssn, tax_id, national_id, passport | | financial | iban, account_number, card_number | | date_of_birth | birth, dob, birthday, date_of_birth |
Automating Detection
Scheduled (Weekly)
SELECT pgcomply.schedule_jobs(install := true);
-- Adds: weekly drift detection every Monday at 07:00
CI/CD Integration
# In your deployment script
DRIFT=$(psql -t -c "SELECT count(*) FROM pgcomply.schema_drift() WHERE confidence = 'high'")
if [ "$DRIFT" -gt 0 ]; then
echo "ERROR: Unregistered PII detected. Register before deploying."
exit 1
fi
Resolving Drift
When drift is detected:
-- Register the new PII column
SELECT pgcomply.register_pii('users', 'mobile_phone', 'phone', 'user_id');
-- Re-classify data levels
SELECT pgcomply.auto_classify();
-- Run drift check again to verify
SELECT * FROM pgcomply.schema_drift();
-- Should return empty
Real-World Drift Scenarios
Scenario: The Emergency Hotfix
At 2 AM, a developer adds a column to fix a production issue:
ALTER TABLE orders ADD COLUMN customer_phone TEXT;
-- Fixed the shipping notification bug. Back to bed.
Next Monday, the weekly drift check catches it:
SELECT * FROM pgcomply.schema_drift();
drift_type | table_name | column_name | suspected_pii | confidence
------------+------------+----------------+---------------+-----------
new_column | orders | customer_phone | phone | high
Without drift detection, this column would remain unregistered indefinitely — invisible to forget(), not included in inspect(), and not masked for analytics roles.
Scenario: The New Developer
A new team member creates a staging utility table:
CREATE TABLE tmp_user_export (
user_id TEXT,
full_name TEXT,
email_addr TEXT,
home_address TEXT,
phone_mobile TEXT
);
Drift detection flags five PII columns in a single table:
drift_type | table_name | column_name | suspected_pii | confidence
------------+-----------------+---------------+---------------+-----------
new_table | tmp_user_export | full_name | person_name | high
new_table | tmp_user_export | email_addr | email | high
new_table | tmp_user_export | home_address | address | high
new_table | tmp_user_export | phone_mobile | phone | high
Scenario: Stale Registrations
A table was dropped during a migration but its PII registration remains:
drift_type | table_name | column_name | suspected_pii | confidence
-------------+-----------------+-------------+---------------+-----------
stale_reg | legacy_contacts | email | email | high
Clean up:
DELETE FROM pgcomply.pii_registry WHERE table_name = 'legacy_contacts';
Customizing Detection Patterns
pgcomply detects PII by column name patterns. If your codebase uses non-standard naming:
-- Check what patterns pgcomply uses
-- Column names containing these substrings trigger detection:
-- email, mail, phone, mobile, tel, name, first_name, last_name,
-- address, street, city, zip, postal, ip, ip_addr, ssn, tax_id,
-- national_id, passport, iban, account_number, card, birth, dob,
-- salary, income, wage
-- If your columns use different conventions (e.g., 'kontakt_telefon'),
-- register them manually:
SELECT pgcomply.register_pii('kontakte', 'kontakt_telefon', 'phone', 'kontakt_id');
Summary
Schema drift detection is the difference between compliance that works today and compliance that works tomorrow. Run pgcomply.schema_drift() after every deployment, and automate it weekly as a safety net. The cost of catching drift early is one SQL call; the cost of missing it is an audit finding.