The Fifth Elephant 2018

The seventh edition of India's best data conference

DevOps for Data Science: Experiences from building a cloud-based data science platform

Submitted by Anand Chitipothu (@anandology) on Apr 1, 2018

Section: Full talk
Technical level: Intermediate
Status: Rejected

Abstract

Productionizing data science applications is non-trivial. Suboptimal practices, the people-heavy nature of traditional approaches, and developers' love of complex solutions for the sake of using cool technologies make the situation even worse.

There are two key ingredients required to streamline this: “the cloud” and “the right level of devops abstractions”.

In this talk, I’ll share experiences from building a cloud-based platform for streamlining data science, and show how such solutions can greatly simplify building and deploying data science and machine learning applications.

Outline

  • Why is productionizing data science hard?
  • What are the hurdles?
  • Why Cloud?
  • The DevOps challenges
  • The power of abstractions
  • Optimizing for Developer Experience (DX)
  • Case studies from the rorodata platform
  • Summary

Speaker bio

Anand has been crafting beautiful software for a decade and a half. He’s now building a data science platform, rorodata, which he recently co-founded. He regularly conducts advanced programming courses through Pipal Academy. He is a co-author of web.py, a micro web framework in Python. He has worked at Strand Life Sciences and the Internet Archive.

Links

Slides

https://anandology.com/tmp/devops-for-datascience-draft.pdf
