The Fifth Elephant 2016

India's most renowned data science conference

Anti-patterns in designing machine learning systems

Submitted by Suchana Seth (@suchana) on Monday, 2 May 2016

Technical level

Advanced

Status

Submitted

Total votes:  +25

Abstract

The talk will focus on ML-specific challenges in designing data science systems, how such systems acquire technical debt, and what we can do at the design level to mitigate some of the risks.

Key takeaway
Learn how to foresee these pitfalls and design your pipelines and systems to avoid them.

This talk is intended for an audience already familiar with applying machine learning algorithms.

Outline

In this talk, we’ll cover these sources of risk to ML systems:
Data drift - how to handle feature distributions that shift over time
Post-model heuristics - when and how to add heuristics to model output
Hidden downstream consumers - how to identify and plan for these
Unacknowledged data dependencies - how to identify and plan for these
Feedback loops - the good and the bad
Decision thresholds & action limits - how to keep them sane
Reproducibility - how to ensure it
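
As an illustration of the data drift item above, here is a minimal sketch of one possible check, not necessarily the approach the talk will present. It compares a feature's values at training time against recent values using the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical CDFs). The function names and the 0.2 threshold are assumptions made for this example; a real system would tune the threshold per feature.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Return the two-sample Kolmogorov-Smirnov statistic.

    This is the largest absolute gap between the empirical CDFs of the
    two samples: 0.0 means identical empirical distributions, 1.0 means
    the samples do not overlap at all.
    """
    ref = sorted(reference)
    cur = sorted(current)
    if not ref or not cur:
        raise ValueError("both samples must be non-empty")

    max_gap = 0.0
    # Both empirical CDFs are step functions that only change at observed
    # values, so checking every observed value is enough to find the maximum.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


def has_drifted(reference, current, threshold=0.2):
    """Flag a feature as drifted when the KS statistic exceeds a threshold.

    The default threshold is an arbitrary illustrative value (an assumption),
    not a recommendation.
    """
    return ks_statistic(reference, current) > threshold


if __name__ == "__main__":
    training_values = [1, 2, 3, 4]
    recent_values = [3, 4, 5, 6]
    print(ks_statistic(training_values, recent_values))
    print(has_drifted(training_values, recent_values))
```

A check like this could run on a schedule for each input feature, with an alert raised or retraining triggered when a feature is flagged.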

Speaker bio

Suchana is a physicist-turned-data scientist with 8 years of experience in research, startups and product labs. She volunteers with DataKind in her free time, and mentors data-for-good projects.
