Data Governance Meetups

Experiential discussions on engineering for security, compliance and privacy

Engineering for compliance and data governance is a challenge for all industries and verticals. Whether it is data localization, compliance required to connect with public digital platforms, GDPR, CCPA or the impending Personal Data Protection (PDP) Bill in India, companies have to engineer solutions for:

  1. Data acquisition
  2. Data storage
  3. Re-engineering data infrastructure for privacy
  4. Data anonymisation
  5. Data security
  6. Privacy, data and security audits

Data governance meetups are peer-to-peer learning sessions where practitioners share implementation experiences and insights with participants. Participants are welcome to submit topics and presentation ideas here.

Sessions are held monthly. Thus far, we have held meetings in July, August and September, covering GDPR compliance, frameworks for automating data governance, and building resources that can help us, as the tech industry, to establish standards for upcoming data governance laws. Video recordings are published at https://hasgeek.com/fifthelephant/data-governance-meetups/videos

The October meeting will be held on Saturday, 10 October. Atif Akhtar of ThoughtWorks will speak about Data Catalogs and how to do them right. Mayur Ralekar of Skizzle will talk about end-to-end encryption as applied to email attachments.

Participation in the monthly meetings is via Zoom, for registered participants. Alternatively, you can watch the YouTube livestream on this page.

About the curators: This meetup series is curated by The Fifth Elephant alumnus Rajat Venkatesh, with the active involvement of Devangana Khokhar and Shadab Siddiqui.

For queries, contact 7676332020 or email fifthelephant.editorial@hasgeek.com

Hosted by

The Fifth Elephant - known as one of the best data science and Machine Learning conferences in Asia - has transitioned into a year-round forum for conversations about data and ML engineering; data science in production; data security and privacy practices.
Atif Akhtar

@atifakhtar1

Data Governance: Data Catalog options and lessons learned trying to do it right

Submitted Sep 23, 2020

Many organizations have recently started taking data governance seriously, given the new laws in several countries regulating the use of data and the heavy penalties for leaks. This is compounded by how much more data each of these organizations now generates. With this added urgency, many data governance strategies succeed or fail based on the tooling chosen and the priorities and trade-offs considered.

In this talk we look at the major OSS options for data catalogs (including Apache Atlas and Marquez), their maturity, the unique selling point of each, and how to go about choosing the right one for you.

Outline

Data catalog first principles, covering features such as:
Data Dictionary
Lineage
Business Glossary
Data SLAs
Ownership and Data modelling
Search and Exploration
Tagging and Compliance
Data feeds and Democratization
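
To make a few of these features concrete, here is a minimal sketch of a catalog that records a data dictionary, ownership, tags and lineage, and supports search. It is not modelled on any of the tools discussed in the talk; the class names (Column, Dataset, Catalog), the dataset names and the "pii" tag are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    dtype: str
    description: str = ""                      # data dictionary entry
    tags: set = field(default_factory=set)     # e.g. "pii", for compliance


@dataclass
class Dataset:
    name: str
    owner: str                                   # ownership
    columns: list = field(default_factory=list)
    upstream: list = field(default_factory=list)  # lineage: direct parents


class Catalog:
    """Hypothetical in-memory catalog; real tools persist and index this."""

    def __init__(self):
        self._datasets = {}

    def register(self, dataset):
        self._datasets[dataset.name] = dataset

    def search(self, term):
        """Names of datasets whose name or column descriptions mention term."""
        term = term.lower()
        return sorted(
            d.name for d in self._datasets.values()
            if term in d.name.lower()
            or any(term in c.description.lower() for c in d.columns)
        )

    def tagged(self, tag):
        """(dataset, column) pairs that carry the given tag."""
        return sorted(
            (d.name, c.name)
            for d in self._datasets.values()
            for c in d.columns
            if tag in c.tags
        )

    def lineage(self, name):
        """All transitive upstream datasets of the named dataset."""
        seen, stack = set(), list(self._datasets[name].upstream)
        while stack:
            parent = stack.pop()
            if parent not in seen:
                seen.add(parent)
                if parent in self._datasets:
                    stack.extend(self._datasets[parent].upstream)
        return sorted(seen)


# Illustrative data only.
catalog = Catalog()
catalog.register(Dataset(
    "raw_users", "ingest-team",
    [Column("email", "string", "User email address", {"pii"})],
))
catalog.register(Dataset(
    "daily_signups", "analytics-team",
    [Column("signup_count", "int", "Signups per day")],
    upstream=["raw_users"],
))
catalog.register(Dataset("growth_report", "analytics-team",
                         upstream=["daily_signups"]))

print(catalog.lineage("growth_report"))
print(catalog.tagged("pii"))
print(catalog.search("email"))
```

Even in this toy form, lineage answers "which reports are affected if raw_users changes?" and tagging answers "where does personal data live?", which are the kinds of questions the features above are meant to support.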

Key OSS catalogs: a brief overview and comparison
Amundsen vs. Atlas vs. DataHub vs. Marquez: how they compare on the features above

Which features should you go for as an org?
Discovery vs. Curation
Security vs. democratization
Compliance and Productivity

Lessons learned (if time remains)

Requirements

NA

Speaker bio

Big data consultant with more than 5 years of experience designing and engineering large-scale data platforms and systems, and a total of 9 years of experience across multiple domains, including but not limited to building distributed systems using Scala, DevOps and cloud-native tech, and blockchain, with interests in IoT and security.
