Data Governance: Data Catalog options and lessons learned trying to do it right

Sat, Jul 4, 2020, 04:00 PM – 05:30 PM IST
Sat, Aug 1, 2020, 04:15 PM – 05:50 PM IST
Sat, Sep 5, 2020, 04:00 PM – 05:35 PM IST
Sat, Oct 10, 2020, 04:00 PM – 05:50 PM IST

This submission has been added to the schedule

Submitted Sep 23, 2020

Session type: Full talk - 40 mins

Many organizations have recently started taking Data Governance seriously, driven by new laws on the use of data in various countries and by heavy penalties for leaks. The pressure is compounded by how much more data each of these organizations now generates. With the stakes this high, a Data Governance strategy is often made or broken by the choice of tooling and by the priorities and trade-offs considered.

In this talk we look at the major OSS options for a data catalog (Apache Atlas, Marquez, Amundsen, and DataHub), their maturity, the USP of each, and how to go about choosing the right one for you.

Outline

Data Catalog first principles, covering features such as:
Data Dictionary
Lineage
Business Glossary
Data SLAs
Ownership and Data modelling
Search and Exploration
Tagging and Compliance
Data feeds and Democratization
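
To make a few of these features concrete, here is a minimal sketch of a catalog as an in-memory data structure. It is not modeled on any of the tools discussed in the talk; the `Dataset` and `Catalog` classes, their field names, and the sample dataset names are all hypothetical and chosen only for illustration. It covers the data dictionary, ownership, tagging, search, and lineage items from the list above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Dataset:
    """One hypothetical catalog entry; field names are illustrative only."""
    name: str
    owner: str                                               # ownership
    columns: Dict[str, str] = field(default_factory=dict)    # data dictionary: column -> description
    tags: Set[str] = field(default_factory=set)              # tagging and compliance, e.g. "pii"
    upstream: List[str] = field(default_factory=list)        # lineage: names of direct source datasets


class Catalog:
    """A toy in-memory catalog (an assumption, not a real tool's API)."""

    def __init__(self) -> None:
        self._datasets: Dict[str, Dataset] = {}

    def register(self, dataset: Dataset) -> None:
        self._datasets[dataset.name] = dataset

    def search(self, term: str) -> List[str]:
        """Search and exploration: match on dataset name or column description."""
        term = term.lower()
        return sorted(
            d.name for d in self._datasets.values()
            if term in d.name.lower()
            or any(term in desc.lower() for desc in d.columns.values())
        )

    def lineage(self, name: str) -> Set[str]:
        """Return every transitive upstream dataset of `name`."""
        seen: Set[str] = set()
        stack = list(self._datasets[name].upstream)
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if current in self._datasets:
                stack.extend(self._datasets[current].upstream)
        return seen

    def tagged(self, tag: str) -> List[str]:
        """Compliance view: all datasets carrying a given tag."""
        return sorted(d.name for d in self._datasets.values() if tag in d.tags)


# Hypothetical sample data.
catalog = Catalog()
catalog.register(Dataset("raw_orders", "ingest-team", {"email": "Customer email address"}, {"pii"}))
catalog.register(Dataset("clean_orders", "data-eng", {"order_id": "Unique order key"}, upstream=["raw_orders"]))
catalog.register(Dataset("revenue_report", "analytics", upstream=["clean_orders"]))
```

With this sample data, `catalog.lineage("revenue_report")` walks back through `clean_orders` to `raw_orders`, and `catalog.tagged("pii")` lists the datasets a compliance review would need to look at. Real catalogs persist this metadata and collect it automatically, which is where the tools compared next differ.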

Key OSS catalogs: a brief overview and comparison
Amundsen vs. Atlas vs. DataHub vs. Marquez: how do they compare on the features above?

Which features should you go for as an org?
Discovery vs. Curation
Security vs. democratization
Compliance and Productivity

Lessons learned (if time remains)

Requirements

Speaker bio

Big data consultant with more than 5 years of experience designing and engineering large-scale data platforms and systems. 9 years of total experience across multiple domains, including but not limited to building distributed systems in Scala, DevOps and cloud-native tech, and blockchain, with interests in IoT and security.

Hosted by

The Fifth Elephant

Jumpstart better data engineering and AI futures

Data Governance Meetups
