The Fifth Elephant 2019

Gathering of 1000+ practitioners from the data ecosystem

BoF on ML platforms

Submitted by Venkata Pingali (@pingali) via Abhishek Balaji (@booleanbalaji) on Wednesday, 17 July 2019

Session type: Birds of a Feather (BoF) session of 1 hour

Abstract

On machine learning platforms, journeys in building them, and managing infrastructure for ML platforms

Outline

The purpose of this BoF is to have a conversation around the platforms that
organizations are building to develop and deploy ML models. We will discuss a
number of practical challenges in developing and deploying ML platforms.

We will touch upon:

(a) Do organizations need one, and if so, when?
(b) What should it achieve? What is its value proposition?
(c) What is its relationship to cloud offerings such as Azure ML?
(d) How should one go about developing one?
(e) How should one think about technology/other choices?
(f) What are the challenges in developing and operating one?

Specifically, we will discuss:

(a) Data flows - stability, scaling, changing requirements
(b) Team structure/skill requirements and availability
(c) Development Support - Notebooks, production vs test, realtime vs batch
(d) Life cycle management - Planning, deployment, evolution
(e) Operations - Monitoring, debugging, evolution to latest tooling
(f) Pressures - Balancing the need to deliver against the need to architect
(g) Processes - For development efficiency, correctness
(h) Data Governance - access and data copy management, privacy
(i) Scaling - how to grow with data sets, number of models, computational requirements, and diversity

Requirements

Interest in productionization of machine learning

Speaker bio

Participants:

  • Krupal Modi, Director of Machine Learning, Haptik
  • Subir Mansukhani, India Head, Domino Data Lab
  • Ravi SK, Sr. Architect, Walmart Labs
  • Soumya Simanta, Principal Architect, Swiggy
  • Venkata Pingali, CEO, Scribble Data (Moderator)

Comments

  • Krupal Modi (@superkrups) 4 months ago (edited 4 months ago)

    Hi,
    Following are a few additional subtopics, related to the above topics, that could potentially be discussed:
    1) Data flows - Security concerns involved in making data available to data scientists. Privacy concerns in getting data tagged.
    2) Team structure - What technical skill sets are required across the end-to-end life cycle of an ML project, and how are teams organised at different organisations? Workflow between data engineers, data scientists, devops and ml-ops.
    3) Planning for productionization - What practices can help optimise the path from Jupyter notebooks to a production-ready service? What practices can help ensure that training-serving skew is minimised?
    4) Operations including monitoring, debugging - Best ml-ops practices to be followed. Setting up systems to detect training-serving skew. Challenges in closing the loop. Refer to this video to understand more about ml-ops practices: https://www.youtube.com/watch?v=20h_RTHEtZI&t=15
    5) Pressures - Balancing the need to deliver against the need to architect - How do you keep up with continuously evolving deployment practices? Amidst fast-evolving projects and frameworks such as Kubeflow, Kubeflow Fairing, Seldon Core, etc., and quickly emerging managed solutions, it’s always tempting to rearchitect, but when is it truly required, and how do you measure the value gained from it?
    6) Accompanying processes and standards - How to measure and increase the productivity of data science teams at different stages of the company? What processes and standard practices can ensure maximum productivity?

    • Venkata Pingali (@pingali) 4 months ago

      Thanks! Incorporated.

  • Ravishankar Suribabu (@ravishankarks) 4 months ago

    Some additions that could be taken up:

    Managing Infrastructure for Machine Learning Platform
    Value proposition of ML Platform
    Infrastructure Needs for the Walmart ML Platform
    Do ML Platforms evolve?
    Technical stack choices
    Why not use available choices such as Google MLE or Azure ML as infrastructure? Why customize our own?
    Our Lessons Learnt
    How to handle scale

    • Venkata Pingali (@pingali) 4 months ago

      Incorporated your suggestions. Please do join us for the conversation.

  • Krishna Durai (@krishnadurai) 4 months ago

    • Why Kubernetes for ML?
    • Complexity of workflows for model training and evaluation
    • Cost of maintenance: OSS software
    • Kubernetes makes things modular and composable
    • Kubernetes makes it easier to deploy and scale
    • Downside of Kubernetes

    • Venkata Pingali (@pingali) 4 months ago

      There will definitely be some discussion on Kubernetes but the BoF itself is not a forum for debating Kubernetes. There are others. The purpose is to get a conversation going around ML platforms and how to think about them.

      • Krishna Durai (@krishnadurai) 4 months ago

        @pingali Please ignore the above comment. I have posted this under the wrong BoF.

        • Venkata Pingali (@pingali) 4 months ago

          Ok. All worth discussing, though. :)
