Submissions

Submissions for MLOps November edition

On ML workflows, tools, automation and running ML in production

Accepting submissions till 01 Nov 2021, 11:59 PM

Not accepting submissions

We are accepting experiential talks and written content on the following topics:

  1. ML development workflows.
  2. ML deployment frameworks.
  3. Data lineage.
  4. Model lineage.
  5. Model ethics/bias testing.
  6. A/B testing frameworks.
  7. Model governance.
  8. Explainability/interpretability of models at runtime.
  9. Impact of change in MLOps mindset in product organizations.
  10. DataOps workflows.
  11. DataOps frameworks.
  12. Alerting, monitoring and managing models in production.
  13. Growing and managing data teams.
  14. MLOps in research.
  15. Deployment and infrastructure for machine learning.
  16. ROI (Return on Investment) for MLOps.

Who should speak?

  1. MLOps engineers who build and maintain ML workflows and deploy ML models.
  2. Data engineers building production-scale data pipelines, feature stores, and model dashboards, and handling model maintenance.
  3. Tech leaders/engineers/scientists/product managers of companies who have built tools and products for ML productivity.
  4. Tech leaders/engineers/scientists/product managers of companies who have built tools, products, and processes for DataOps to support ML.
  5. Tech leaders/engineers/scientists/product managers who have experience with products that failed to make a mark in the market due to ML failures.
  6. Investors who are investing in the space of ML productivity tools, frameworks and landscape.
  7. Privacy/ethics stakeholders involved in model governance and testing for ethics/bias.

Content can be submitted in the form of:

  • 15-minute talks
  • 30-minute talks
  • 1,000-word written articles

All content will be peer-reviewed by practitioners from industry.

Aniruddha Chodhury

Productionize engineered data with feature store in Kubeflow Orchestration

In this session, we will build a feature store pipeline in Kubeflow with a serving endpoint, using Spark jobs with batch and real-time Kafka ingestion for offline and online feature extraction on Google Cloud Platform. We will also track feature store metrics in Grafana and Prometheus.
  • 1 comment
  • Under evaluation
  • 23 Jun 2021

Krishna Gogineni

ROI of building internal MLOps vs adopting open-source vs buying managed options

“To build or to buy?” That is the question this session will explore. I will compare and contrast end-to-end managed MLOps offerings like H2O.ai and SageMaker vs. building your own platform from established components vs. mixing and matching components from managed, open-source, and self-built sources. As part of this exercise, I will also cover the current state of the ecosystem in…
  • 3 comments
  • Under evaluation
  • 30 Jun 2021

Biswa Singh

CapFlow: A scalable ML framework for training and serving CRM machine learning model operations

Preface: At Capillary Technologies, we have around 30 different models serving different use cases like recommendations, personalization, insight, decision-making, and several other retail CRM predictions. We have hundreds of customers and terabytes of data that we consume to train and serve these models through online and offline inference. To scale to such a level and cater to continuous traini…
  • 7 comments
  • Under evaluation
  • 01 Jul 2021

Sudeep Gupta

Empowering Data Scientists at Farfetch to GoFar with PaaS

Machine Learning is a strategic goal at Farfetch, and enabling Data Scientists to be more productive is a key objective in achieving it. Cloud providers like AWS, Google, and Azure have many services and products that offer various capabilities to create value in the ML and data space. But often the problem with using these services, or building one such product from scratch, is that the key stakeholders -…
  • 9 comments
  • Submitted
  • 29 Jun 2021

Santosh Kumar

Cap-auto-feature: Scalable feature store for training CRM machine learning model operations

Introduction: Feature engineering is the heart of modeling, especially for tabular datasets. When modeling, it’s often a good idea to add historical data on top of the contextual data; this makes the data richer and more robust for all kinds of machine learning problems. Centralized feature engineering makes this possible. A good feature significantly impacts the results of the model. It can help…
  • 4 comments
  • Submitted
  • 08 Jul 2021

Vinodh Kumar R

Handling Bias while building ML systems

At Eightfold.ai our mission has been to help enable the right career for everyone using the power of AI. We employ deep learning algorithms that leverage information from career data of 1 billion+ profiles - these algorithms in turn help organizations find the most relevant talent and individuals identify the best career options for themselves.
  • 2 comments
  • Under evaluation
  • 05 Jul 2021

Anay Nayak

Monitoring Data Quality at Scale

Level: Beginner. Timing: 15 min. Abstract: Data drift and data cascades are real problems that can wreak havoc on any business insights. When operating with data at scale and dealing with external systems, any change in data can have a cascading impact through all the data pipelines, which is difficult to trace and costly to correct. Data quality frameworks like Deequ / G…
  • 0 comments
  • Submitted
  • 14 Jul 2021

Praveen Dhinwa

ML Infrastructure for Feed Recommendations at ShareChat

Overview: In this talk, we will describe ShareChat’s feed recommendation infrastructure in detail. The talk will delve into various ML-infrastructure-related aspects, such as model training, serving, the design and development of the feature store, and feature-computation pipelines. It will also provide insights and learnings that we have obtained via building these large-scale, low …
  • 1 comment
  • Submitted
  • 14 Jul 2021

Kartikeya Sharma

Production Grade DataOps Framework for Building Intelligence Over User-Content Interaction Data

  • 0 comments
  • Submitted
  • 14 Jul 2021

Ramjee Ganti

ML Fairness 2.0 - Intersectional Group Fairness

Topic: As more companies adopt AI, more people question the impact AI creates on society, especially on algorithmic fairness. However, most metrics that measure the fairness of AI algorithms today don’t capture the critical nuance of intersectionality. Instead, they hold a binary view of fairness, e.g., protected vs. unprotected groups. In this talk, we’ll discuss the latest research on intersect…
  • 0 comments
  • Submitted
  • 14 Jul 2021

Sandya Mannarswamy

Opening the NLP Blackbox - Analysis, Evaluation and Testing of NLP Models

Rapid progress in NLP research has seen a swift translation to real-world commercial deployment. While a number of success stories of NLP applications have emerged, failures of translating scientific progress in NLP to real-world software have also been considerable. Evaluation of NLP models is often limited to held-out test set accuracy on a handful of datasets, and analysis of NLP models is oft…
  • 1 comment
  • Submitted
  • 13 Jun 2021

Gaetan Castelein

Using feature stores to build a fraud model

Feature stores enable companies to make the difficult leap from research to production machine learning. At their best, feature stores allow you to define new features, automate the data pipelines to process feature values, and serve data for training and online inference. You can quickly and reliably serve features to your production models so your customers aren’t waiting for predictions.
  • 2 comments
  • Submitted
  • 02 Jun 2021

Shilpa Shivapuram

Brands Dilemma: Personalization at the cost of privacy

We are in an era where we are so well connected virtually that we are all part of the humongous digital footprint we leave behind: for example, when we buy anything from a marketplace, our app purchases, our entertainment preferences, and many more. These footprints are patterns of our behavior, which could be private or public. Brands are investing heavily in this data to understand and cater to the…
  • 2 comments
  • Submitted
  • 16 Apr 2021

Dipen Chawla

Designing an Autonomous Workbench for Data Science on AWS

In the wake of the COVID-19 pandemic and the consequent remote work setup, we, the Engineering team at Episource, were keen on developing a hosted, self-service platform that would allow our Data Science counterparts to access the compute and data they needed for their experiments on the fly.
  • 2 comments
  • Submitted
  • 09 Jul 2021

Nitesh Garg

Data and Model versioning

https://docs.google.com/presentation/d/1qLRYcE00wnD83FgWxlLCoS_eFzaIL4yMGDOIhwJ1zNs/edit?usp=sharing
  • 0 comments
  • Submitted
  • 12 Jul 2021

Abinaya Mahendiran

Building Human-in-the-loop pipeline in MLOps

The objective of this talk is to shed some light on how productionized models can be improved iteratively by adopting a human-in-the-loop pipeline (active learning strategy and human annotation) in an MLOps lifecycle.
  • 1 comment
  • Submitted
  • 14 Jul 2021

Anithapraba Kathirvel

Taking ML models to production

In spite of massive ongoing improvements in machine learning and deep learning, a majority of data science teams still struggle to solve the last mile problem - taking models to production. Due to the ad-hoc nature of training iterations and lack of a standard process, the process of tracking experiments and deploying models is anything but a smooth one. Especially in the last decade, Machine Lea…
  • 1 comment
  • Submitted
  • 13 Jul 2021

Aaditya Talwai

ML Governance from the Bottom-Up: Deriving Data Access Policy from Code through Ethical Monkey-Patching

Practical implementations of Data Governance tend to enforce access control at the datastore-level - think ACLs for S3, Snowflake or HDFS. But top-down enforcement of an infrastructure policy can be painful for the engineers working day-to-day with the data, especially in an ETL or Feature Engineering context. For example, critical data needed for extracting features can become obscured or even a…
  • 3 comments
  • Submitted
  • 08 Jul 2021

Prakshi Yadav

Reducing technical debt for ML platforms

Deploying machine learning models at scale is a time-consuming process that involves many stages of simulation and stress testing. Continuous testing is needed to ensure that the engineers’ ML models are performing as anticipated in production, especially by monitoring data/model drift. What if the data scientists want to put their latest model enhancements to the test in a simulated near-producti…
  • 0 comments
  • Submitted
  • 12 Jul 2021

Nir Barazida

Teamwork on a Machine Learning project that scales

A Machine Learning project is composed of a variety of artifacts that are distinct from one another. As a project evolves and grows in complexity, this becomes a significant challenge across multiple aspects of our workflow.
  • 4 comments
  • Submitted
  • 09 Jul 2021

Sayed Adil Hussain

Search Driven Analytics: Enabled through a Conversational Bot

To deliver insights at the speed of thought, without requiring users to go through dashboards, apply filters, or ask analysts, we at Mahindra & Mahindra have developed Genie for Analytics, a voice-enabled conversational analytics assistant. This engine is first-of-its-kind, cutting-edge work at Mahindra. It integrates natural language processing, query processing and natural lang…
  • 2 comments
  • Awaiting details
  • 06 Jul 2021

Srinivasa Rao Aravilli

MLOps patterns to address Machine Learning Models deterioration in production

In this talk, I will present the typical life cycle of ML models.
  • 4 comments
  • Submitted
  • 07 Jul 2021

Hosted by

Jump starting better data engineering and AI futures