Data Governance Meetups

Experiential discussions on engineering for security, compliance and privacy

Atif Akhtar

@atifakhtar1

Data Governance: Data Catalog options and lessons learned trying to do it right

Submitted Sep 23, 2020

Many organizations have recently started taking data governance seriously, driven by new laws on the use of data in various countries and heavy penalties for leaks, and compounded by how much more data each organization now generates. Under this pressure, a data governance strategy often succeeds or fails on the choice of tooling and on the priorities and trade-offs considered.

In this talk we look at the major OSS options for a data catalog (including Apache Atlas and Marquez), their maturity, the USP of each, and how to go about choosing the right one for you.

Outline

Data catalog first principles, covering features such as:
Data Dictionary
Lineage
Business Glossary
Data SLAs
Ownership and Data modelling
Search and Exploration
Tagging and Compliance
Data feeds and Democratization

Key OSS catalogs and a brief overview/comparisons
Amundsen vs Atlas vs DataHub vs Marquez: how do they compare on the features above?

Which features should you go for as an org?
Discovery vs. Curation
Security vs democratization
Compliance and Productivity

Lessons learned (if time remains)

Requirements

NA

Speaker bio

Big data consultant with more than 5 years of experience designing and engineering large-scale data platforms and systems. Has 9 years of total experience across multiple domains, including building distributed systems using Scala, DevOps and cloud-native tech, and blockchain, with interests in IoT and security.

Hosted by

Jump starting better data engineering and AI futures