##The eighth edition of The Fifth Elephant will be held in Bangalore on 25 and 26 July. A thousand data scientists, ML engineers, data engineers and analysts will gather at the NIMHANS Convention Centre to discuss:
- Model management, including data cleaning, instrumentation and productionizing data science.
- Bad data and case studies of failure in building data products.
- Identifying and handling fraud, and data security at scale.
- Applications of data science in agriculture, media and marketing, supply chain, geo-location, SaaS and e-commerce.
- Feature engineering and ML platforms.
- What it takes to create data-driven cultures in organizations of different scales.
1. Meet Peter Wang, co-founder of Anaconda Inc, and learn about why data privacy is the first step towards robust data management; the journey of building Anaconda; and Anaconda in enterprise.
2. Talk to the Fulfillment and Supply Group (FSG) team from Flipkart, and learn about their work with platform engineering where ground truths are the source of data.
3. Attend tutorials on Deep Learning with RedisAI and on TransmogrifAI, Salesforce’s open source AutoML library.
4. Discuss interesting problems to solve with data science in agriculture, a SaaS perspective on multi-tenancy in machine learning (with the Freshworks team), and bias in intent classification and recommendations.
5. Meet data science, data engineering and product teams from sponsoring companies to understand how they are handling data and leveraging intelligence from data to solve interesting problems.
##Why you should attend
- Network with peers and practitioners from the data ecosystem
- Share approaches to solving expensive problems such as cleanliness of training data, model management and versioning data
- Demo your ideas in the demo session
- Join Birds of a Feather (BOF) sessions to have productive discussions on focussed topics, or start your own BOF session.
##Full schedule published here: https://hasgeek.com/fifthelephant/2019/schedule
For more information about The Fifth Elephant, sponsorships, or anything else, call +91-7676332020 or email email@example.com
Kubeflow: ML on Kubernetes
Session type: Full talk of 40 mins
Data science software teams find it tedious to implement ML workflows in a repeatable, maintainable and sustainable manner. Even when such a platform is built, it faces further challenges: adding newer workflows or capabilities, staying portable across infrastructure platforms (cloud, on-premise and hybrid), scaling compute resources, and managing the number of teams using the platform.
In this talk, participants will learn about Kubeflow, an open source machine learning platform. The Kubeflow project is “dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable”. Anywhere you are running Kubernetes, you should be able to run Kubeflow for your ML workloads. Through the live demo, participants will learn to use Kubeflow from a Jupyter Notebook to create pipelines of tasks that reflect their day-to-day ML work. The demo example will cover several parts of a data scientist’s day-to-day work: pre-processing data, tuning hyperparameters through Katib and then training a model, evaluating the model against test data, and deploying it to serve predictions.
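The stages named above (pre-processing, hyperparameter tuning, training, evaluation, serving) can be pictured with a toy sketch in plain Python. It does not use the Kubeflow Pipelines SDK or Katib: the data, the threshold "model" and every function name here are hypothetical stand-ins, chosen only to show how each stage hands its output to the next.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Tuple[float, int]

# Hypothetical toy data: label is 1 when the feature is large; one row is dirty.
RAW: List[Tuple[Optional[float], int]] = [
    (0.1, 0), (0.4, 0), (0.35, 0), (0.8, 1),
    (None, 1), (0.9, 1), (0.6, 1), (0.2, 0),
]


def preprocess(raw: List[Tuple[Optional[float], int]]) -> Tuple[List[Row], List[Row]]:
    """Drop rows with a missing feature, then split roughly 70/30 into train/test."""
    clean = [(x, y) for x, y in raw if x is not None]
    split = len(clean) * 7 // 10
    return clean[:split], clean[split:]


def accuracy(threshold: float, rows: List[Row]) -> float:
    """Fraction of rows where 'feature > threshold' matches the label."""
    correct = sum(1 for x, y in rows if int(x > threshold) == y)
    return correct / len(rows)


def tune(train_rows: List[Row], candidates: List[float]) -> float:
    """Stand-in for the tuning stage: pick the candidate with the best train accuracy."""
    return max(candidates, key=lambda t: accuracy(t, train_rows))


def train(train_rows: List[Row], threshold: float) -> Dict[str, float]:
    """Stand-in for training: the 'model' is just the chosen threshold."""
    return {"threshold": threshold, "n_train": float(len(train_rows))}


def evaluate(model: Dict[str, float], test_rows: List[Row]) -> float:
    """Score the model on held-out rows."""
    return accuracy(model["threshold"], test_rows)


def serve(model: Dict[str, float]) -> Callable[[float], int]:
    """Stand-in for serving: return a function that answers prediction requests."""
    return lambda x: int(x > model["threshold"])


def run_pipeline() -> Dict[str, object]:
    """Run the stages in order, passing each stage's output to the next."""
    train_rows, test_rows = preprocess(RAW)
    best = tune(train_rows, [0.2, 0.5, 0.7])
    model = train(train_rows, best)
    score = evaluate(model, test_rows)
    return {"threshold": best, "test_accuracy": score, "predict": serve(model)}


if __name__ == "__main__":
    result = run_pipeline()
    print(result["threshold"], result["test_accuracy"], result["predict"](0.9))
```

In a Kubeflow pipeline, each of these stages would typically run as its own containerized step on Kubernetes rather than as a function call in one process, which is what lets the same pipeline move between clusters.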
- Machine learning is hard; maintaining it is harder (integrating with legacy systems, portability of the platform across vendors)
- Kubernetes provides infrastructure extensibility
- Composability, portability and scalability on top of Kubernetes
- Acquiring the Kubernetes skills needed to develop on it can be challenging, hence the open source way!
- Develop, deploy and manage portable distributed ML on Kubernetes
- Features of Kubeflow: developing ML pipelines, hyperparameter tuning, training and serving, all from a Jupyter Notebook
- Pipeline example demo about TF MNIST (Jupyter Notebook) with hyperparameter tuning, training and serving
- Benefits: Democratizing Machine Learning - Show real life impact and social cause
- Who’s contributing?
- What’s next in Kubeflow?
- Pitch about being open / open source development
- About Community - Why? What? How? etc
- Contacts for reaching out to contribute or know about Kubeflow
BTech Computer Science from Visvesvaraya National Institute of Technology, Nagpur
Krishna currently works as an open source developer for Kubeflow, the platform this presentation is about, in the Cisco AI Cloud CTO Team. Cisco AI, as a group, is ranked third in contributions to Kubeflow by lines of code (http://devstats.kubeflow.org/d/5/companies-summary?orgId=1).
Krishna has three years of experience in designing and engineering AI platforms, having previously worked with three start-ups, including SigTuple, an AI-based medical analysis company that developed a platform called ‘Kurma’. Kubeflow solves the same problems that Kurma addresses, in a sustainable manner, with Kubernetes as its infrastructure layer. Having moved from proprietary ML software to its open source counterparts, he can describe the paradigm shift that developers faced while trying to solve these same problems within the bounds of a single firm.
- Author’s LinkedIn: https://www.linkedin.com/in/krishnadurai/
- About Kubeflow: https://www.kubeflow.org
- Kubeflow GitHub: https://github.com/kubeflow/kubeflow
- Kubeflow blog - A year in perspective: https://medium.com/kubeflow/kubeflow-in-2018-a-year-in-perspective-49c273b490f4
- Kubeflow Demo Material: https://github.com/CiscoAI/KFLab/tree/master/tf-mnist, https://github.com/CiscoAI/KFLab/tree/master/pipelines/tf-mnist