Speak at The Fifth Elephant 2026 Annual Conference
Share your work with the community
Jul 17 (Fri) and Jul 18 (Sat), 2026, 09:00 AM – 06:00 PM IST
Shril Kumar
@shril
Submitted Jun 20, 2026
You may have asked Google or Meta (Facebook) to delete your personal data. “Delete this customer’s data” sounds like a one-line request. At petabyte scale it can spawn a job that runs for days and burns serious compute. Once data lands in cloud object storage (S3, GCS, Azure Blob) as immutable files, a single GDPR or CCPA deletion request means rewriting and repacking those files, chasing the same IDs as they spread into copies and downstream tables, then proving to auditors that every copy is gone. As privacy law tightens and AI pipelines fan personal data into ever more derived datasets, brute-force deletion is becoming both fragile to operate and hard to justify on cost. This session makes a simple argument: don’t delete the data, cut the thread that ties it to a person.
The core design pattern is Identifier Severance: store every record against a stand-in “virtual” ID and keep the lookup table that links it to the real account isolated. To “delete” someone, destroy their row in that table. The historical records are orphaned, with no path back to the individual, and a lake-wide rewrite collapses into a small, fast, audited change.
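A minimal in-memory sketch of Identifier Severance, offered as an illustration and not as the speaker's implementation. The names `IdentityVault`, `record_event`, and `events_for` are hypothetical, and a plain Python list stands in for immutable files in object storage.

```python
import uuid
from datetime import datetime, timezone


class IdentityVault:
    """Hypothetical isolated lookup table: real account ID -> virtual ID."""

    def __init__(self):
        self._real_to_virtual = {}
        self.audit_log = []

    def virtual_id_for(self, account_id):
        # Mint a random virtual ID on first sight. It must not be derivable
        # from the account ID (so no hashing of the real ID).
        if account_id not in self._real_to_virtual:
            self._real_to_virtual[account_id] = uuid.uuid4().hex
        return self._real_to_virtual[account_id]

    def lookup(self, account_id):
        return self._real_to_virtual.get(account_id)

    def sever(self, account_id):
        # "Delete" the person: drop the single mapping row.
        removed = self._real_to_virtual.pop(account_id, None) is not None
        # The audit entry deliberately stores neither ID, only that a
        # severance happened and when.
        self.audit_log.append({
            "action": "sever",
            "removed": removed,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return removed


def record_event(vault, lake, account_id, payload):
    # Records are written against the virtual ID only.
    lake.append({"vid": vault.virtual_id_for(account_id), **payload})


def events_for(vault, lake, account_id):
    # Resolving a person's history requires the mapping row.
    vid = vault.lookup(account_id)
    if vid is None:
        return []
    return [event for event in lake if event["vid"] == vid]


vault = IdentityVault()
lake = []  # stand-in for immutable files in object storage
record_event(vault, lake, "acct-42", {"action": "play"})
record_event(vault, lake, "acct-42", {"action": "pause"})
record_event(vault, lake, "acct-7", {"action": "play"})

vault.sever("acct-42")
print(len(lake), events_for(vault, lake, "acct-42"))
```

After `sever`, the three records are still in `lake`, but nothing in the vault resolves `acct-42` to them. The sketch only covers the live mapping; any surviving copy of the old row in a backup, replica, or log would undo the severance.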
For the most sensitive fields, we layer on a secondary pattern, Crypto-Shredding: encrypt each subject’s data under its own key and destroy the key on request. The concept is the easy part; this talk digs into where it breaks: why severed-link data can still be personal data, why the guarantee collapses if a single copy of the old mapping survives in a backup, replica, or log, and how residual signals like device or location can still re-identify someone. You’ll leave knowing when severing the link is enough and when you have to destroy the keys outright.
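A small sketch of the Crypto-Shredding idea using only the Python standard library. Everything here is an assumption made for illustration: `KeyStore`, `encrypt_field`, and `decrypt_field` are hypothetical names, and the cipher is a toy HMAC-SHA256 keystream that stands in for a vetted authenticated cipher such as AES-GCM, which a real system would take from a reviewed cryptography library and pair with a managed key service.

```python
import hashlib
import hmac
import secrets


def _keystream(key, nonce, length):
    # Toy keystream: HMAC-SHA256 over nonce + counter. Illustration only,
    # not a substitute for a vetted authenticated cipher.
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


class KeyStore:
    """Hypothetical per-subject key table, kept apart from the data."""

    def __init__(self):
        self._keys = {}

    def key_for(self, subject_id):
        # One random 256-bit key per data subject, created on first use.
        if subject_id not in self._keys:
            self._keys[subject_id] = secrets.token_bytes(32)
        return self._keys[subject_id]

    def get(self, subject_id):
        return self._keys.get(subject_id)

    def shred(self, subject_id):
        # Destroying the key is the "deletion"; ciphertext stays in place.
        return self._keys.pop(subject_id, None) is not None


def encrypt_field(keys, subject_id, plaintext):
    nonce = secrets.token_bytes(16)  # fresh nonce per encrypted value
    data = plaintext.encode("utf-8")
    stream = _keystream(keys.key_for(subject_id), nonce, len(data))
    return nonce, bytes(a ^ b for a, b in zip(data, stream))


def decrypt_field(keys, subject_id, nonce, ciphertext):
    key = keys.get(subject_id)
    if key is None:
        return None  # key was shredded: the value is no longer recoverable
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream)).decode("utf-8")


keys = KeyStore()
nonce, blob = encrypt_field(keys, "subject-1", "user@example.com")
print(decrypt_field(keys, "subject-1", nonce, blob))
keys.shred("subject-1")
print(decrypt_field(keys, "subject-1", nonce, blob))
```

The guarantee is only as strong as the key destruction: if a copy of the key survives in a backup or replica, the ciphertext is still readable, which mirrors the surviving-mapping failure described above.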
Data platform owners, data engineers, and architects running compliant enterprise platforms: anyone who has to honor GDPR / CCPA erasure at lake scale across immutable storage and a sprawl of derived datasets. No privacy-law background needed; comfort with the basics of a data lake is enough.
Shril Kumar is a Senior Software Engineer at Roku.
He works on the Ad-Tech Data Platform, which powers 100M+ Roku devices worldwide. He has 8 years of work experience; before Roku, he helped build the Marketing Platform at Groupon.