The eighth edition of The Fifth Elephant will be held in Bangalore on 25 and 26 July. A thousand data scientists, ML engineers, data engineers and analysts will gather at the NIMHANS Convention Centre to discuss:
- Model management, including data cleaning, instrumentation and productionizing data science.
- Bad data and case studies of failure in building data products.
- Identifying and handling fraud, and data security at scale.
- Applications of data science in agriculture, media and marketing, supply chain, geo-location, SaaS and e-commerce.
- Feature engineering and ML platforms.
- What it takes to create data-driven cultures in organizations of different scales.
1. Meet Peter Wang, co-founder of Anaconda Inc, and learn about why data privacy is the first step towards robust data management; the journey of building Anaconda; and Anaconda in enterprise.
2. Talk to the Fulfillment and Supply Group (FSG) team from Flipkart, and learn about their work with platform engineering where ground truths are the source of data.
3. Attend tutorials on Deep Learning with RedisAI and on TransmogrifAI, Salesforce’s open source AutoML library.
4. Discuss interesting problems to solve with data science in agriculture, a SaaS perspective on multi-tenancy in Machine Learning (with the Freshworks team), and bias in intent classification and recommendations.
5. Meet data science, data engineering and product teams from sponsoring companies to understand how they are handling data and leveraging intelligence from data to solve interesting problems.
Why should you attend?
- Network with peers and practitioners from the data ecosystem
- Share approaches to solving expensive problems such as cleanliness of training data, model management and versioning data
- Demo your ideas in the demo session
- Join Birds of a Feather (BOF) sessions to have productive discussions on focussed topics. Or, start your own BOF session.
Full schedule published here: https://hasgeek.com/fifthelephant/2019/schedule
For more information about The Fifth Elephant, sponsorships, or anything else, call +91-7676332020 or email email@example.com
Machine Learning Model Management with MLflow
Session type: BOF session of 1 hour
Data is the new oil, and its volume is growing exponentially. Most companies rely extensively on data science to inform business decisions, audit ML patterns, find faults in business logic, and more. They run a large number of machine learning models to produce these results.
Managing ML models in production is non-trivial. Model management, meaning the training, maintenance, deployment, monitoring, organization and documentation of machine learning (ML) models, is a critical task in virtually all production ML use cases. Wrong model management decisions can lead to poor performance of an ML system, high maintenance costs and less effective utilization. The key concerns for model management are:
- Computational challenges: machine learning model definition and validation, decisions on model retraining, adversarial settings.
- Data management challenges: lack of a declarative abstraction for the whole ML pipeline, querying model metadata, model interpretation.
- Engineering challenges: multiple tools and frameworks make integration complex, users have heterogeneous skill levels, trained models must stay backwards compatible, and training results are hard to reproduce.
Custom ML platforms such as FBLearner by Facebook and Michelangelo by Uber address the above concerns, but they have their own limitations:
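Two of the concerns listed above, querying model metadata and reproducing a training result, can be illustrated with a minimal sketch. It uses only the Python standard library and is not MLflow's API: the `RunRecord` schema, the helper names, the JSON-lines store and the toy parameter and metric values are all assumptions made for illustration.

```python
import hashlib
import json
import tempfile
import time
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class RunRecord:
    """Metadata for one training run (hypothetical schema, not an MLflow type)."""
    model_name: str
    params: dict
    metrics: dict
    data_fingerprint: str
    created_at: float = field(default_factory=time.time)


def fingerprint(rows):
    """Hash the training data so a run can be tied to the exact data it saw."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def log_run(store: Path, record: RunRecord) -> None:
    """Append one run as a JSON line; the file acts as a queryable metadata store."""
    with store.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(record)) + "\n")


def best_run(store: Path, metric: str) -> dict:
    """Return the logged run with the highest value of `metric`."""
    with store.open(encoding="utf-8") as fh:
        runs = [json.loads(line) for line in fh if line.strip()]
    return max(runs, key=lambda run: run["metrics"][metric])


# Toy usage: two runs on the same data with different hyperparameters.
# The metric values are made up for the example, not measured results.
rows = [[0.1, 0.2, 1], [0.4, 0.3, 0]]
store = Path(tempfile.mkdtemp()) / "runs.jsonl"
log_run(store, RunRecord("churn", {"max_depth": 3}, {"auc": 0.81}, fingerprint(rows)))
log_run(store, RunRecord("churn", {"max_depth": 6}, {"auc": 0.84}, fingerprint(rows)))
winner = best_run(store, "auc")
print(winner["params"], winner["data_fingerprint"])
```

Recording parameters, metrics and a data fingerprint for every run is what makes a past result comparable and repeatable. A real platform adds artifact storage, a UI and deployment on top of this idea.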
- They standardize the data preparation, training and deployment loop specific to particular platform and business needs.
- They are limited to a few algorithms and frameworks.
- They are tied to one company's infrastructure and are hard to open source.
The Databricks team took these concerns as the motivation to develop MLflow, an open source, cloud-agnostic platform for machine learning model management. Benefits of MLflow for machine learning model management:
- Works with any ML library and language.
- Platform independent: ML models run the same way anywhere, for example on a local system or on any cloud platform.
- Designed to be useful for organisations of any size, from 1 person to 10,000.
Key focus areas of Machine Learning Model Management with MLflow:
- Managing ML models in production is non-trivial. What are the challenges and concerns of the machine learning model management lifecycle?
- What is machine learning model management?
- Motivation and concepts behind the introduction of MLflow
- How to solve the problem of model management using MLflow
- MLflow components
- Real-time problem and use case
Basic understanding of machine learning and its workflow
Ravi Ranjan is a Senior Data Scientist at Publicis Sapient. He is part of the Centre of Excellence and is responsible for building machine learning models at scale. He has worked on multiple engagements with clients across geographies, mainly from the automobile, banking, retail and insurance industries. In his current role, he is working on a hyper-personalized recommendation system for the automobile industry, focused on machine learning, deep learning and real-time processing of large-scale data using MLflow and Kubeflow.
He holds a Bachelor's degree in Computer Science and has completed a proficiency course in Reinforcement Learning from IISc, Bangalore.