MLOps Conference

On DataOps, productionizing ML models, and running experiments at scale.

Accepting submissions till 14 Jul 2021, 11:00 PM

Machine Learning (ML) is at the helm of products. As products evolve over time, ML has to evolve with them. In the 2010s, DevOps culture came to the forefront for engineering teams. The 2020s will be all about MLOps.

MLOps stands for Machine Learning Operations. It focuses on the workflows, thought processes and tools used to create ML models and to manage their evolution over time. ML workflows differ from one organization to another because the problem space, the maturity of teams and their experience with ML tools vary widely.

MLOps relies on DataOps. DataOps is about data operations: it helps define data and the SLOs for data (how it is stored, managed and mutated over time), thereby providing the foundation for sound ML. The success or failure of ML models depends heavily on DataOps, where data is well managed and brought into the system in a well-thought-out manner. ML and data processes have to evolve to provide insight into why certain models no longer behave as they did before.

Productionizing ML models is a challenge, but so is running experiments at scale. MLOps not only caters to scaling ML models in production, but also provides guidelines and thought processes to support rapid prototyping and research for ML teams.

MLOps Conference 2021 edition

The 2021 edition is curated by Nischal HP, Director of Data at Scoutbee.

The conference covers the following themes:

  1. Machine Learning Operations
  2. Machine Learning in Production
  3. Privacy and Security in Machine Learning
  4. Tooling and frameworks in Machine Learning
  5. Economies of Machine Learning

Speakers from Doordash, Twilio, Scribble Data, Microsoft Research Labs India, Freshworks, Aampe, Myntra, Farfetch and other organizations will share their experiences and insights on the above topics.

Schedule: https://hasgeek.com/fifthelephant/mlops-conference/schedule

Who should participate in MLOps conference?

  1. Data/MLOps engineers who want to learn about state-of-the-art tools and techniques.
  2. Data scientists who want a deeper understanding of model deployment/governance.
  3. Architects who are building ML workflows that scale.
  4. Tech founders who are building products that require ML or building developer productivity products for ML.
  5. Product managers who want to learn about the process of building ML products.
  6. Directors, VPs and senior tech leadership who are building ML teams.

Contact information: Join The Fifth Elephant Telegram group on https://t.me/fifthel or follow @fifthel on Twitter. For inquiries, contact The Fifth Elephant on fifthelephant.editorial@hasgeek.com or call 7676332020.

Hosted by

The Fifth Elephant - known as one of the best #datascience and #machinelearning conferences in Asia - is transitioning into a year-round forum for conversations about data and ML engineering; data science in production; data security and privacy practices.

Supported by

Scribble Data builds feature stores for data science teams that are serious about putting models (ML, or even sub-ML) into production. The ability to systematically transform data is the single biggest determinant of how well these models do. Scribble Data streamlines the feature engineering proces…

Promoted

Deep dives into privacy and security, and understanding needs of the Indian tech ecosystem through guides, research, collaboration, events and conferences.

Nirant K

@nirant

ML Ops for Startups

Submitted Jul 11, 2021

Context: Early Startup

A plethora of MLOps work has been done at large data and model-serving scale. This talk instead looks at an early-stage startup: it introduces the different ops problems, the tools available and how to pick among them. In most cases, I’ll share an example to anchor the idea.

This talk is based on my experience at Verloop.io as it grew from 0 to 1. Verloop is among India’s largest conversational automation platforms. We work with some of India’s largest category-defining companies to serve their users.

Talk Objective

This is a primer on how to think about your experimentation/dev and deployment process. The bulk of this talk is organised in 2 columns: a problem, and a recommended process or tool for it. I borrow from my experience working on ML engineering challenges (e.g. keeping latencies low enough for chat to be usable) and from things we learnt along the way (e.g. the need for data versioning).

Slides for the talk: https://bit.ly/startupmlops2021

Talk Outline

  • MLOps Primer 101: Intro to layout of this talk [2-3 minutes]

    • Assumptions/Prerequisites: Team size, Skills, Organisation, Shipping [4-5 minutes]
    • Level 1: DevOps: Builds, Tests and CI/CD [5-7 minutes]
      • Intent: You don’t repeat a mistake which you’ve made previously
      • Code Review: Github with the ReviewNB App
      • Set up testing for business logic and the app at the very least. Add the tests to CI/CD
      • [Recommend] Test your ML models with automated tests like CheckList
  • Level 2 MLOps: Experiments and Dev Cycle [8-10 minutes]

    • Intent: Automated Training
    • Our 3 key elements: data, code/architecture, model weights
    • Data needs to be mise en place - ideally pipelines and ETL processes are figured out
    • Reproducible pipelines – focus on integrating with your existing infra, and it’s okay to simply pull using a Jupyter notebook
      It’s also okay to aim for more complexity by setting up DAG tools like Airflow and MLflow
    • Experiment Tracking: Sacred, Neptune.ai, W&B
    • Both training code and resulting models are version controlled
    • DVC.org for Data Version Control
    • Manual release: don’t bother automating it this early
    • Manage your own releases, add SDE to your team if needed
    • [Optional] Managed compute
    • Excellent time to adopt Docker, and if your company is on Kubernetes (K8s), port your service to K8s as well
  • Level 3 MLOps: Model Deployment [Total: 5-7 minutes]

    • Intent: Automated Model Deployment
    • Data pipeline gathers data, including annotations, at a predictable cadence without engineering time
    • Release engineering: Adopt what your engineering team is doing: Canary or Blue-Green deployment
    • Set up infra to A/B test your models or blends of them in shadow mode
    • Concept, Vocabulary, Label Drift: Don’t bother. Your data pipeline will update your models for you
  • Questions from Audience [3-5 minutes]
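
The shadow-mode idea in Level 3 can be sketched as follows. This is a minimal illustration in plain Python, not the setup used at Verloop: `ShadowRouter`, `production_model` and `candidate_model` are hypothetical names, and the two keyword "models" are toy stand-ins for real classifiers. Users only ever see the production answer; the candidate runs on the same input and disagreements are recorded for later review.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A "model" here is just a function from text to a label (an assumption for this sketch).
Model = Callable[[str], str]


def production_model(text: str) -> str:
    """Toy stand-in for the model currently serving traffic."""
    return "greeting" if "hello" in text.lower() else "other"


def candidate_model(text: str) -> str:
    """Toy stand-in for the new model being evaluated in shadow mode."""
    words = text.lower().split()
    return "greeting" if ("hello" in words or "hi" in words) else "other"


@dataclass
class ShadowRouter:
    """Serve the production model and silently compare it with a candidate."""

    production: Model
    candidate: Model
    total: int = 0
    disagreements: List[Dict[str, str]] = field(default_factory=list)

    def predict(self, text: str) -> str:
        self.total += 1
        live = self.production(text)
        try:
            shadow = self.candidate(text)
        except Exception as exc:
            # A failing candidate must never affect the live response.
            shadow = f"<error: {type(exc).__name__}>"
        if shadow != live:
            self.disagreements.append(
                {"input": text, "production": live, "candidate": shadow}
            )
        # Only the production answer is returned to the caller.
        return live

    def disagreement_rate(self) -> float:
        return len(self.disagreements) / self.total if self.total else 0.0


if __name__ == "__main__":
    router = ShadowRouter(production_model, candidate_model)
    for message in ["Hello there", "hi team", "refund please"]:
        print(message, "->", router.predict(message))
    print("disagreement rate:", router.disagreement_rate())
```

In a real service the disagreement log would typically go to your existing logging or analytics pipeline, and the candidate call would run off the request path so it cannot add latency.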

This is an example process/maturity path, with recommendations on when you should solve a specific problem within MLOps. It’s not a hard-and-fast guide, and you can always tackle some problems earlier or later.
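
As an illustration of the Level 1 recommendation to test ML models with automated tests, here is a hand-rolled invariance test in the spirit of CheckList. It deliberately does not use the CheckList library; `toy_intent_model`, `perturb` and `invariance_failures` are hypothetical names, and the keyword classifier is a stand-in for a real model. The idea: small edits that should not change the label (casing, trailing punctuation, padding) are applied to each sample, and any prediction that flips is reported.

```python
from typing import Callable, Iterable, List, Tuple


def toy_intent_model(text: str) -> str:
    """Hypothetical stand-in for a real intent classifier."""
    lowered = text.lower()
    if "refund" in lowered:
        return "refund"
    if "hello" in lowered:
        return "greeting"
    return "other"


def perturb(text: str) -> List[str]:
    """Label-preserving edits: upper-casing, trailing punctuation, padding."""
    return [text.upper(), text + "!", "  " + text + "  "]


def invariance_failures(
    model: Callable[[str], str], texts: Iterable[str]
) -> List[Tuple[str, str]]:
    """Return (original, variant) pairs where the prediction changed."""
    failures = []
    for text in texts:
        expected = model(text)
        for variant in perturb(text):
            if model(variant) != expected:
                failures.append((text, variant))
    return failures


def test_intent_model_is_invariant_to_surface_edits():
    # Written as a plain test function so a runner such as pytest can pick it up in CI.
    samples = ["hello there", "I want a refund", "where is my order"]
    assert invariance_failures(toy_intent_model, samples) == []


if __name__ == "__main__":
    test_intent_model_is_invariant_to_surface_edits()
    print("invariance test passed")
```

Adding a test like this to the same CI/CD run as the business-logic tests means a retrained model that becomes sensitive to casing or punctuation fails the build instead of reaching users.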

I use the Microsoft MLOps Model as a reference.

Some experience with the following would be useful for this talk:

  • Working in a team of 2-10 data folks, across modeling, engineering, monitoring, deployment and so on
  • Deploying models with specific requirements, e.g. low latency, high throughput, large data volumes at inference (more than 1TB)

Speaker Info:

Nirant has worked across startups and MNCs in Machine Learning and Data Science roles. These include:

  • Verloop - Natural Language Processing - Conversational Automation for Enterprises
  • Soroco - Computer Vision: Image Segmentation - building Search for Enterprise Documents
  • Samsung Research at the Advanced Technologies Lab - Sensor Fusion & Event Classification
  • Belong.co (NLP/Predictive Analytics)

In his present role at Verloop.io, he focuses on Conversational AI.

He has written a book on Practical NLP for Developers (Published by Packt). This book is a Quickstart Guide for Developers interested in building NLP based solutions, without the patience for pedantic learning on Linguistics and Deep Learning.

Recognition & Contributions

  • Won the Kaggle NLP Kernel Prize from Kaggle and Explosion.AI (makers of spacy.io)
  • Lead Maintainer for awesome-nlp with 11K+ stars - recommended by Andrew Ng’s Deep Learning course CS229 at Stanford
  • GitHub’s official Machine Learning collection includes awesome-nlp as world’s best NLP resource
  • FastAI International Fellowship: 2018 & 2019

Talks

  • PyCon India 2019: http://bit.ly/pycon2019talk (Google Slides Talk)
  • inMobi Tech Talks: A Nightmare on the LM Street; Slides
  • Wingify DevFest: NLP for Indian Languages; Slides, Video
  • PyData Bengaluru Inaugural Talk: Video, Resources

Personal Website: https://nirantk.com
Twitter: https://twitter.com/NirantK/
Github: https://github.com/NirantK
LinkedIn: https://linkedin.com/in/nirant
Book: https://www.amazon.in/dp/B07L3PLQS1/

