The Fifth Elephant 2018

The seventh edition of India's best data conference

DevOps for Data Science: Experiences from building a cloud-based data science platform

Submitted by Anand Chitipothu (@anandology) on Saturday, 31 March 2018

Technical level

Intermediate

Section

Full talk

Status

Submitted

Total votes:  +2

Abstract

Productionizing data science applications is non-trivial. Suboptimal practices, the people-heavy nature of traditional approaches, and developers' love of complex solutions for the sake of using cool technologies make the situation even worse.

There are two key ingredients required to streamline this: “the cloud” and “the right level of devops abstractions”.

In this talk, I’ll share the experiences of building a cloud-based platform for streamlining data science and how such solutions can greatly simplify building and deploying data science and machine learning applications.

Outline

  • Why is productionizing data science hard?
  • What are the hurdles?
  • Why Cloud?
  • The DevOps challenges
  • The power of abstractions
  • Optimizing for Developer Experience (DX)
  • Case studies from the rorodata platform
  • Summary

Speaker bio

Anand has been crafting beautiful software for a decade and a half. He’s now building a data science platform, rorodata, which he recently co-founded. He regularly conducts advanced programming courses through Pipal Academy. He is co-author of web.py, a micro web framework in Python. He has worked at Strand Life Sciences and Internet Archive.

Links

Slides

https://anandology.com/tmp/devops-for-datascience-draft.pdf

Comments

  • 1
    Zainab Bawa (@zainabbawa) Reviewer 8 months ago

    How will this talk not come across as a pitch for RoRodata? You will talk about the solution you have built. I am more interested in knowing what is the larger problem space and what options are available beyond the solution you built.

  • 1
    Anand Chitipothu (@anandology) Proposer 8 months ago

    I can completely omit the part “Case studies from the rorodata platform” if you think this feels like a pitch. I think I have a lot of interesting insights to share in this space in general, like embracing the cloud, the importance of developer experience, the power of abstractions etc.

  • 1
    Anand Chitipothu (@anandology) Proposer 7 months ago (edited 7 months ago)

    I’ve submitted the draft slides. I’ve included a case study from one of the systems that I’ve built. The case study talks about the problem, the approach we’ve taken and a discussion to justify the architectural choices made. There is no mention of rorodata except in the introduction slide.

    I’m planning to make the following improvements to the draft:

    • Improve the flow
    • Add another case study
    • Add a section explaining why building systems like this is hard, showing pitfalls and anti-patterns
    • Expand the Developer Experience section a bit
