Machine Learning Model Management with MLflow
Submitted by Ravi Ranjan (@raviranjan03) on Wednesday, 10 April 2019
Session type: Tutorial
Data is the new oil, and its volume is growing exponentially. Most companies now rely heavily on data science to inform business decisions, perform audits on ML patterns, decode faults in business logic, and more. To do so, they run a large number of machine learning models.
Managing ML models in production is non-trivial. Model management, meaning the training, maintenance, deployment, monitoring, organization and documentation of machine learning (ML) models, is a critical task in virtually all production ML use cases. Poor model management decisions can degrade the performance of an ML system, raise maintenance costs and lead to less effective utilization. The key concerns for model management are:
- Computational challenges: machine learning model definition and validation, decisions on model retraining, adversarial settings.
- Data management challenges: lack of a declarative abstraction for the whole ML pipeline, querying model metadata, model interpretation.
- Engineering challenges: multiple tools and frameworks make integration complex, users have heterogeneous skill levels, trained models must stay backwards compatible, and training results are hard to reproduce.
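To make two of these concerns concrete (querying model metadata and reproducing a training result), here is a minimal sketch of what a tracking layer records for each training run. It uses only the Python standard library and is not MLflow's API; the function names `log_run` and `best_run`, the `runs.jsonl` file and the parameter and metric values are all hypothetical, chosen for illustration.

```python
import hashlib
import json
import tempfile
import time
import uuid
from pathlib import Path


def log_run(store: Path, params: dict, metrics: dict, data: bytes) -> str:
    """Append one training run to a JSON-lines file and return its id."""
    record = {
        "run_id": uuid.uuid4().hex,
        "timestamp": time.time(),
        "params": params,
        "metrics": metrics,
        # Hash of the training data, so a later run can confirm it used the same input.
        "data_sha256": hashlib.sha256(data).hexdigest(),
    }
    with store.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record["run_id"]


def best_run(store: Path, metric: str) -> dict:
    """Return the logged run with the highest value of `metric`."""
    with store.open(encoding="utf-8") as f:
        runs = [json.loads(line) for line in f if line.strip()]
    return max(runs, key=lambda r: r["metrics"][metric])


# Hypothetical store location and placeholder values, for illustration only.
store = Path(tempfile.mkdtemp()) / "runs.jsonl"
data = b"feature,label\n1.0,0\n2.0,1\n"  # stand-in for a real training set
log_run(store, {"max_depth": 3}, {"accuracy": 0.81}, data)
log_run(store, {"max_depth": 5}, {"accuracy": 0.86}, data)

winner = best_run(store, "accuracy")
print(winner["params"])  # parameters of the run with the highest accuracy
```

Even this small record answers "which settings produced the best model, and on which data?". A model management platform adds shared storage, artifact handling and deployment on top of the same idea.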
Custom ML platforms such as FBLearner by Facebook and Michelangelo by Uber address the above concerns, but they have their own limitations:
- They standardize the data preparation, training and deployment loop in ways specific to a particular platform and its business needs.
- They are limited to a few algorithms and frameworks.
- They are tied to one company's infrastructure and are hard to open source.
The Databricks team took these concerns as the motivation to develop MLflow, an open source, cloud-agnostic platform for machine learning model management. Benefits of MLflow for machine learning model management:
- Works with any ML library and language.
- It is platform independent, i.e. ML models run the same way anywhere, for example on a local system or on any cloud platform.
- It is designed to be useful for organisations of 1 or 10,000 people.
Key focus areas for Machine Learning Model Management with MLflow:
- Managing ML models in production is non-trivial. What are the challenges and concerns of the machine learning model management lifecycle?
- What is machine learning model management?
- Motivation and concepts behind the introduction of MLflow
- How to solve the problem of model management using MLflow
- MLflow components
- Realtime problem and use case
Basic understanding of machine learning and its workflow
Ravi Ranjan works as a Senior Data Scientist at Publicis Sapient. He is part of the Centre of Excellence and is responsible for building machine learning models at scale. He has worked on multiple client engagements, mainly in the automobile, banking, retail and insurance industries, across geographies. In his current role, he is working on a hyper-personalized recommendation system for the automobile industry, focused on machine learning, deep learning and real-time processing of large-scale data using MLflow and Kubeflow.
He holds a Bachelor's degree in Computer Science, with a proficiency course in Reinforcement Learning from IISc, Bangalore.
Subarna Rana is a lead Data Scientist and an innovator. He is part of Publicis Sapient's core data science team and is responsible for building models by applying state-of-the-art techniques in machine learning and deep learning. He is an experienced data science professional who specializes in building and managing data products from conceptualization to deployment, and he is interested in solving challenging machine learning problems.
He has worked on various machine learning projects involving predictive modeling, forecasting, optimization, image recognition, recommendation engines and natural language processing. He holds a master's degree in this field from the University of Southampton.
When not working on official projects, he spends time on technical writing and blogging. He also contributes to the open source world by creating packages and answering technical questions. He enjoys participating and competing in open data science challenges.