Sat, 23 Nov 2019, 08:30 AM – 05:30 PM IST
Kurian Benoy
In this talk we will discuss current practices for organizing ML projects with traditional open-source tools such as Git and Git-LFS, along with the limitations of this tool set.
From there, we will explain the motivation for developing new, ML-specific version control systems.
Currently, the life cycle of any machine learning model goes through the following process:
Git cannot handle large amounts of data running to gigabytes in size, while Git-LFS comes with the built-in difficulty of supporting at most 2 GB of data (a GitHub limitation).
Data Version Control, or DVC (dvc.org), is an open-source command-line tool written in Python. We will show how to version datasets holding dozens of gigabytes of data, how to version ML models, how to use your favorite cloud storage (S3, GCS, or a bare-metal SSH server) as a data file backend, and how to embrace the best engineering practices in your ML projects.
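To make the idea of versioning large data alongside Git more concrete, here is a minimal sketch of the general pattern: the large file is stored in a content-addressed cache, and only a small pointer file holding its checksum is committed to Git. This is an illustration under stated assumptions, not DVC's actual implementation; the function names `track` and `restore`, the `.pointer.json` suffix, and the cache layout are all hypothetical.

```python
import hashlib
import json
import shutil
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Checksum a file in chunks so large files never need to fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def track(data_file: Path, cache_dir: Path) -> Path:
    """Copy data_file into a content-addressed cache and write a small pointer file.

    Hypothetical helper: only the pointer file would be committed to Git.
    """
    checksum = file_md5(data_file)
    cached = cache_dir / checksum[:2] / checksum[2:]
    cached.parent.mkdir(parents=True, exist_ok=True)
    if not cached.exists():
        shutil.copy2(data_file, cached)
    pointer = data_file.with_name(data_file.name + ".pointer.json")
    pointer.write_text(json.dumps({"path": data_file.name, "md5": checksum}, indent=2))
    return pointer


def restore(pointer: Path, cache_dir: Path) -> Path:
    """Recreate the data file next to its pointer from the cached copy."""
    meta = json.loads(pointer.read_text())
    cached = cache_dir / meta["md5"][:2] / meta["md5"][2:]
    target = pointer.with_name(meta["path"])
    shutil.copy2(cached, target)
    return target
```

Because the pointer file is only a few bytes, Git can version it cheaply, and checking out an older commit followed by a call like `restore` brings back the matching version of the data. In this sketch the cache directory is local; a cloud bucket or SSH server could play the same role.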
Talk Outline
Kurian Benoy is an open-source contributor to CloudCV and DVC. He is the lead organiser of School of AI, Kochi, and an AI enthusiast working on deep learning and computer vision. Kurian is a FOSSASIA Open TechNights winner and gave a talk at the FOSSASIA Open Tech Summit about the [keralarescue.in team](https://www.youtube.com/watch?v=2RzImb5JwMA).
Kurian has been contributing to DVC for the past few months, is among the top 10 contributors to DVC.org, and has written an introductory Kaggle kernel about DVC.
https://docs.google.com/presentation/d/16mbu71NqNH6ULPJWSMDheYwolRrIn1sSLD8JYy9s4ks/edit?usp=sharing