Data Governance 101
Data compliance, privacy and security are hard because:
* There is too much data
* There is too much complexity
* There is no context for data usage
Automation is the only hope.
This talk introduces the first steps to automate data governance tasks to answer:
* Where is my data?
* Who has access to the data?
* How is the data used?
We will discuss data governance automation examples from past work with Amazon Redshift, Snowflake and MySQL.
The information from these tasks will set the foundation for an effective strategy for compliance, privacy and security.
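As a rough illustration of the first question, "Where is my data?", one simple starting point is to scan column names for hints of sensitive data. The sketch below is only an assumption about how such a scan might look: the patterns, the `find_pii_columns` helper and the sample schema are hypothetical, and it is not the API of any of the tools mentioned in this talk. A real scanner would also need to sample column values.

```python
import re

# Hypothetical name patterns for illustration only; real scanners also inspect values.
PII_PATTERNS = {
    "email": re.compile(r"e?mail", re.IGNORECASE),
    "phone": re.compile(r"phone|mobile", re.IGNORECASE),
    "ssn": re.compile(r"ssn|social_security", re.IGNORECASE),
}


def find_pii_columns(schema):
    """Return {(table, column): [pii types]} for columns whose names look sensitive."""
    hits = {}
    for table, columns in schema.items():
        for column in columns:
            matched = [
                kind for kind, pattern in PII_PATTERNS.items() if pattern.search(column)
            ]
            if matched:
                hits[(table, column)] = matched
    return hits


# Made-up schema standing in for metadata read from a database catalog.
schema = {
    "users": ["id", "email_address", "phone_number"],
    "orders": ["id", "user_id", "total"],
}

for (table, column), kinds in find_pii_columns(schema).items():
    print(f"{table}.{column}: {', '.join(kinds)}")
```

In practice the schema would come from the warehouse's own catalog (for example its information schema) rather than a hard-coded dictionary.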
- What is Data Governance?
- Why is Data Governance hard?
  - There is too much data.
  - There is too much complexity.
  - There is no context for data usage.
- Automation examples to ease data governance.
  - Where is my data?
  - Who has access to data?
  - How is the data used?
Rajat Venkatesh has experience building data warehouses and data lakes used by the largest companies in the world. He has helped data-driven companies adopt data governance processes to meet their security and privacy goals. He created a set of open source data governance tools (https://tokern.io/) to help other data teams with similar challenges.
- Tokern - Open Source Data Governance Tools: https://tokern.io
- DbAdminNews Newsletter - https://dbadminnews.substack.com/
- PIICatcher - https://github.com/tokern/piicatcher/
- Data Lineage - https://github.com/tokern/data-lineage/
- LakeCLI - https://github.com/tokern/lakecli/