Submit a talk on data

Submit talks on data engineering, data science, machine learning, big data and analytics through the year – 2019

Data Lineage Service at Slack

Submitted by Atl Arredondo (@atlarredondo) on Friday, 31 May 2019

Session type: Full talk of 40 mins


At Slack, the data engineering team has built tools that allow engineers and others in the company to create their own data pipelines, run interactive queries and build dashboards. Over time, the data volume, the number of datasets and the dependencies between them have increased. This has made data discovery hard and has reduced the reliability of, and trust in, our datasets. In addition, incidents have become harder to debug and assess because we lack visibility into the dataset dependency graph and the data consumers. Capturing lineage data can give us the knowledge necessary to build an application that exposes data dependencies and automates processes.
In this talk, we will go through the development process of our Data Lineage service, our technical challenges and the future of this service at Slack.


Slack Data Infrastructure
Problems with Data Flow Visibility
Data Lineage
SQL Parser
Service API
Data Lineage Applications
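
To make the idea of a dataset dependency graph concrete, here is a minimal sketch in Python. It is not Slack's implementation: the dataset names, the edge list and the function names are all hypothetical, chosen only to illustrate how captured lineage edges could answer the question "which datasets are affected if this one breaks?".

```python
from collections import defaultdict, deque

# Hypothetical lineage edges as (upstream, downstream) pairs.
# In a real service these would come from captured lineage data.
EDGES = [
    ("raw_events", "daily_active_users"),
    ("raw_events", "message_counts"),
    ("daily_active_users", "growth_dashboard"),
]


def build_graph(edges):
    """Map each upstream dataset to the set of datasets derived from it."""
    graph = defaultdict(set)
    for upstream, downstream in edges:
        graph[upstream].add(downstream)
    return graph


def downstream_of(graph, dataset):
    """Return every dataset that depends, directly or transitively, on `dataset`."""
    seen = set()
    queue = deque([dataset])
    while queue:
        current = queue.popleft()
        # .get avoids creating empty entries for leaf datasets
        for child in graph.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen
```

Calling `downstream_of(build_graph(EDGES), "raw_events")` returns the three datasets that would be affected by an incident in `raw_events`, which is the kind of visibility the abstract describes as missing.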

Speaker bio

My name is Atl Arredondo and I work as a Data Engineer at Slack.
I have been working at Slack for the past two and a half years, building core datasets and tooling to improve data discovery and consumption.
During the past two quarters I have been working with my team to develop our internal Data Lineage Service in order to add visibility into our data dependency flow.



  • Abhishek Balaji (@booleanbalaji) Reviewer 2 months ago

    Thanks for your submissions, Atl. I’ve moved this for evaluation under our upcoming conference - The Fifth Elephant 2019 in Bangalore, India. We’ll share this proposal with our review team and get back to you with feedback and comments.
