Akash Khandelwal

@akash099

Maintaining Data Pipelines' Sanity at Scale: How Validations and Metric Visualization Came to Our Rescue!

Submitted Apr 15, 2019

Have you ever lived through the nightmare of corrupt data from an upstream source leading to a rogue index push to prod?

In this talk, I'll use case studies from our work at Flipkart to walk through:

  1. Writing test cases for data pipelines. Validating datasets and generated patterns in addition to business logic.
  2. Capturing and visualizing important metrics, and alerting on them. In-lab and external recurring evaluation.
  3. Bringing order to chaos. Dealing with staleness and volume drops.
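
To make the staleness and volume-drop checks in the list above concrete, here is a minimal sketch of what such a validation could look like. It is not the pipeline described in the talk: the function name `validate_snapshot`, its parameters, and the thresholds (a 20% drop, a 24-hour age limit) are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def validate_snapshot(row_count, previous_row_count, produced_at, now,
                      max_drop_ratio=0.2, max_age=timedelta(hours=24)):
    """Return a list of validation failures; an empty list means healthy.

    All names and thresholds here are hypothetical, for illustration only.
    """
    failures = []

    # Volume drop: compare this run's row count with the previous run's.
    if previous_row_count > 0:
        drop = (previous_row_count - row_count) / previous_row_count
        if drop > max_drop_ratio:
            failures.append(
                f"volume drop of {drop:.0%} exceeds {max_drop_ratio:.0%}"
            )

    # Staleness: the upstream snapshot must have been produced recently.
    age = now - produced_at
    if age > max_age:
        failures.append(f"data is stale: age {age} exceeds {max_age}")

    return failures


if __name__ == "__main__":
    now = datetime(2019, 4, 15, 12, 0, tzinfo=timezone.utc)
    problems = validate_snapshot(
        row_count=500,
        previous_row_count=1000,
        produced_at=now - timedelta(hours=30),
        now=now,
    )
    for problem in problems:
        print(problem)
```

One plausible use is as a gate before publishing: if the returned list is non-empty, the pipeline skips the index push and raises an alert instead.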

Outline

https://docs.google.com/presentation/d/1IgTCvBB3Hja51cFrU3n2kDuvj7oHcOcCd3mANlxZtzU/edit#slide=id.p

Requirements

NA

Speaker bio

Akash is a software developer on the Search Autosuggest team at Flipkart. Previously, he worked on building Flipkart's recommendation system. He designed real-time and batch pipelines to power recommendations, including use cases such as product bundling, similar products and personalisation. He is interested in applying machine learning to pattern mining and in deploying data processing pipelines at scale. He graduated with a dual degree in Computer Science & Engineering from IIT Delhi.

Slides

https://docs.google.com/presentation/d/1IgTCvBB3Hja51cFrU3n2kDuvj7oHcOcCd3mANlxZtzU/edit#slide=id.p
