The Fifth Elephant 2019

Gathering of 1000+ practitioners from the data ecosystem

The last mile problem in ML

Submitted by Krishna Sangeeth KS (@hal9001) on Wednesday, 10 April 2019


Abstract

“We have built a machine learning model. What next?”

There is quite a journey between building a model in a Jupyter notebook and taking it to production.
I call this the “last mile problem in ML”. This last mile can be an easy walk if we embrace some good ideas.

This talk covers some opinionated ideas on how to get around the pitfalls of deploying ML models to production.

We will go over the questions below in detail and think about solutions for them.

  • How do we fix the zombie model apocalypse, a state in which nobody knows how a model was trained?
  • In science, experiments are considered valid only if they are reproducible. Should this be the case in data science as well?
  • Training a model on your local machine and waiting an eternity for it to complete is no fun. What are some better ways of doing this?
  • How do you package your machine learning code in a robust manner?
  • Does an ML project have the luxury of not following good Software Engineering principles?
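One way to picture the reproducibility question above is a small manifest written next to every trained model. The following is a minimal sketch using only the Python standard library, not the approach shown in the talk; the function names, the `manifest.json` file name, and the chosen fields are assumptions made for illustration.

```python
import hashlib
import json
import platform
import random
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_run_manifest(out_dir, data_path, params, seed, code_version):
    """Write manifest.json (hypothetical name) describing how a model was trained."""
    random.seed(seed)  # fix the seed before any training code runs
    manifest = {
        "params": params,                       # hyperparameters used for this run
        "seed": seed,                           # random seed, so the run can be repeated
        "code_version": code_version,           # e.g. a git commit hash supplied by the caller
        "data_sha256": file_sha256(data_path),  # fingerprint of the training data
        "python_version": platform.python_version(),
    }
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    manifest_path = out_dir / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest_path
```

With a record like this stored beside the model artifact, anyone can later check which data, parameters, and code version produced it. Experiment trackers such as MLflow automate this kind of bookkeeping.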

Outline

  • Discussion on some of the issues with deploying ML models to production.
  • Discussion about MLflow, including a quick demo.
  • Discussion about training bring-your-own (BYO) algorithms on SageMaker.
  • Discussion about packaging ML code in a robust manner.
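The BYO training item in the outline can be sketched as a container entry point. SageMaker mounts its inputs under `/opt/ml` inside the training container: hyperparameters arrive as a JSON object of strings in `input/config/hyperparameters.json`, each data channel appears under `input/data/<channel>`, and files written to `model/` become the model artifact. The sketch below follows that layout; the `scale` hyperparameter, the `training` channel name, the one-number-per-line CSV format, and the mean-value "model" are all placeholders assumed for illustration, not the proposer's demo.

```python
import json
from pathlib import Path


def train(prefix="/opt/ml"):
    """Entry point a BYO container could run when started with `train`."""
    prefix = Path(prefix)

    # Hyperparameters are delivered as a JSON object whose values are strings.
    hp_path = prefix / "input" / "config" / "hyperparameters.json"
    hyperparameters = json.loads(hp_path.read_text()) if hp_path.exists() else {}
    scale = float(hyperparameters.get("scale", "1"))  # hypothetical hyperparameter

    # Read every CSV file in the (assumed) "training" channel, one number per line.
    data_dir = prefix / "input" / "data" / "training"
    values = []
    for csv_file in sorted(data_dir.glob("*.csv")):
        for line in csv_file.read_text().splitlines():
            if line.strip():
                values.append(float(line))
    if not values:
        raise ValueError(f"no training data found in {data_dir}")

    # Placeholder "model": the scaled mean of the training values.
    model = {"mean": scale * sum(values) / len(values)}

    # Files written to the model directory are packaged as the model artifact.
    model_dir = prefix / "model"
    model_dir.mkdir(parents=True, exist_ok=True)
    (model_dir / "model.json").write_text(json.dumps(model))
    return model
```

Because the entry point depends only on a directory layout, the same image can be exercised locally by passing a temporary directory as `prefix`, which is what makes the container approach reusable for in-house training infrastructure.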

Requirements

  • High-level understanding of machine learning.

Speaker bio

Hello World,

My name is Krishna Sangeeth. I currently work as a Data Scientist at Ericsson Global AI Accelerator (GAIA). Prior to Ericsson, I worked at Indix as an ML Engineer. I am a passionate programmer, always on the lookout for something new to learn. I am an open-source enthusiast and have made successful contributions to some of my favorite projects, such as scikit-learn, MLflow and SageMaker.

Github : @whiletruelearn
Twitter : @whiletruelearn

Links

Slides

https://speakerdeck.com/whiletruelearn/the-last-mile-problem-in-ml

Preview video

https://www.youtube.com/watch?v=D4g076pf6cg

Comments

  • Anwesha Sarkar (@anweshaalt) Reviewer 7 months ago

    Thank you for submitting the proposal. Please submit your preview video by 20th April at the latest; it helps us close the review process.

  • Zainab Bawa (@zainabbawa) Reviewer 6 months ago

    Thanks for the details, Krishna.

    The following comments have come in on reviewing your proposal:

    1. Who is the audience for this proposed talk? Who are you trying to convince of what?
    2. The focus of the talk is unclear because the problem statement is too broad. The last mile problem in ML is a large enough problem. What is a narrower, deeper aspect of this big challenge that you can dive into?
    3. On this note, change the title of the talk to focus on something much narrower.
    4. It is unclear why there is discussion about tools such as SageMaker and MLflow. The slides are too thin and they don’t go into details.
    5. The takeaways mentioned on the last slide are far too generic. We want specific insights. For example, what is bad code, and how does one recognize and rectify it? How do you save artifacts of different experiments safely? How does your approach to saving artifacts compare with approaches generally used? Why is your approach better than others?

    Upload revised slides on or before 21 May, incorporating the above feedback, to close the decision on your proposal.

  • Krishna Sangeeth KS (@hal9001) Proposer 5 months ago (edited 5 months ago)

    Please find my thoughts on these questions below. Apologies for the delay in replying.

    1. The primary audience is people who are investing in building ML pipelines but are not aware of good practices for taking an ML model to production. My endeavour is to make people aware of the “Hidden Technical Debt in ML” [1] and some ideas for tackling it.

    2. The ‘last mile problem in ML’ is a term that I coined for the presentation. Google calls it “Hidden Technical Debt in ML” [1], and others call it ML Ops, AI Ops etc. The central premise that I wanted to convey is that ML code is often a thin slice of the system, and we often need an ecosystem in place around it to get a return on investment. My wish was to make people aware of problems such as
      (i) Reproducibility of ML code.
      (ii) Scalability for training.
      (iii) Robust deployment etc

    3. I could probably rename it to “Hidden Technical Debt in ML”, as Google called it. I am not a fanboy of any tool or language, and hence didn’t want to call it “MLflow for doing X” or “SageMaker for doing Y”, even though I am a contributor to both of these OSS projects. What I am interested in is making people aware of the problem, which, to be honest, most people aren’t, and then suggesting ideas that I have found helpful.

    4. My idea was to show quick demos of both MLflow and SageMaker (if network is available). I have already done this once at a local meetup, and having a demo helps people understand it better. The idea was to present people with the problems first and then show how MLflow or SageMaker can help. I also wanted to show how we can implement our own custom algorithm in SageMaker, and how the same idea of moving the training code into Docker containers can help in building in-house training infrastructure if needed.

    5. The definition of bad code is highly subjective and varies from person to person. My definition of bad code would be whatever piles up technical debt over time as the system grows complex. My point is simply that an ML project intended for production doesn’t have the luxury of not following sound SE practices. It makes a lot of sense to follow good SE practices from the start of the ML project rather than treating them as an afterthought. Again, all this is opinion and I am not claiming my approach is better than others. :-)

    [1] https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
