The Fifth Elephant 2016

India's most renowned data science conference


Anti-patterns in designing machine learning systems

Submitted by Suchana Seth (@suchana) on Monday, 2 May 2016

Technical level: Advanced


The talk will focus on ML-specific challenges in designing data science systems, how such systems acquire technical debt, and what we can do at the design level to mitigate some of the risks.

Key takeaway
Learn how to foresee these pitfalls & design your pipelines and systems to avoid them.

This talk is intended for an audience already familiar with applying machine learning algorithms.


In this talk, we’ll cover these sources of risk to ML systems:
Data drift - how to handle feature distributions that shift over time
Post-model heuristics - when and how to add heuristics to model output
Hidden downstream consumers - how to identify and plan for these
Unacknowledged data dependencies - how to identify and plan for these
Feedback loops - the good and the bad
Decision thresholds & action limits - how to keep them sane
Reproducibility - how to ensure it
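
As an illustration of the data drift item above, here is a minimal sketch of one common way to flag a shifted feature distribution: the Population Stability Index (PSI), which compares binned fractions of a reference sample against a current sample. This is not taken from the talk; the function name, the bin count, and the `eps` floor are assumptions chosen for the example, and any alerting threshold would need to be tuned per feature.

```python
import math
import random


def population_stability_index(reference, current, bins=10, eps=1e-6):
    """Return the PSI between two samples of one numeric feature.

    Hypothetical helper for illustration: 0.0 means identical binned
    distributions, and larger values mean a larger shift.
    """
    lo, hi = min(reference), max(reference)
    # Equal-width bins over the reference range; fall back to 1.0 if the
    # reference sample is constant, to avoid dividing by zero.
    width = (hi - lo) / bins or 1.0

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range go into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each fraction at eps so the logarithm below is defined.
        return [max(c / len(values), eps) for c in counts]

    ref_frac = bin_fractions(reference)
    cur_frac = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_frac, cur_frac))


if __name__ == "__main__":
    rng = random.Random(0)
    training = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    # Simulated production data whose mean has drifted by one unit.
    production = [rng.gauss(1.0, 1.0) for _ in range(5000)]
    print("PSI vs. itself:", population_stability_index(training, training))
    print("PSI vs. drifted:", population_stability_index(training, production))
```

In a pipeline, a check like this would typically run per feature on a schedule, with the reference sample frozen at training time so that gradual drift is measured against a fixed baseline.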

Speaker bio

Suchana is a physicist-turned-data-scientist with 8 years of experience in research, startups and product labs. She volunteers with DataKind in her free time, and mentors data-for-good projects.


