Taking ML models to production
Despite major advances in machine learning and deep learning over the last decade, most data science teams still struggle with the last-mile problem: taking models to production. A large number of open-source pre-trained models are now readily available to be fine-tuned and applied to a wide range of problems, yet the ad-hoc nature of training iterations and the lack of a standard process make tracking experiments and deploying models anything but smooth. Depending on the domain and complexity of the problem, a solution often involves an ensemble of models rather than a single one, which can quickly make training and tracking chaotic. Orchestrating these experiments along with their metrics, and having a platform that automates the whole process, therefore becomes a necessity.
In this talk, we’ll discuss a few practical approaches, based on Continuous Delivery principles, that we have followed to solve this problem. Using a mix of open-source, enterprise, and in-house tools, we have built and continue to improve an ML workflow that alleviates most of the challenges encountered while productionising models. It supports model reproducibility, trackability, frequent and easy deployment, and closing the feedback loop after deployment.
Our talk will focus on the challenges we face in the context of ML, which broadly fall under:
1. Version control of models
2. Reproducibility of models
3. Version control of training notebooks
4. Regression testing for ML systems
5. Establishing the feedback loop (Monitoring)
The session is aimed at individuals with basic knowledge of ML, DevOps, and software engineering practices who are looking to build a reliable and repeatable process for training and deploying models.
- Anithapraba Kathirvel
- Rucha Kulkarni