MLOps Conference

On DataOps, productionizing ML models, and running experiments at scale.

Sai Sharan Tangeda

@sai_sharan_t

Managed Feature Store: Improving data reusability & Providing a means for low latency real-time prediction at Udaan

Submitted Jul 11, 2021

A brief talk on a Managed Feature Store built on top of open source Feast. We will start with a walkthrough of the open source Feast feature store, including its architecture and core capabilities, and call out some of its challenges and limitations. We will then describe the enhancements that enable a more robust, secure and scalable deployment: a) managed resources on cloud platforms, e.g. Azure Event Hubs in place of Kafka and Databricks in place of open source Spark; b) integration of RBAC and table-level access control to maintain controlled usage; c) scalable batch ingestion using Spark instead of Pandas, along with new capabilities that increase data reusability.

Speakers:
Dr Mohit Kumar (Head - Data Science, Product Analytics and Data Platform)
Sai Sharan Tangeda (Data Scientist)
Time: 30 mins

Agenda

  1. Introduction
    1. Introduction
    2. Motivation for maintaining a Managed Feature Store
  2. Feast (Open Source): Constructs, Core Capabilities & Limitations
    1. Constructs & Architecture of Feast
    2. Point In Time Join Capabilities with Batch Retrieval
    3. Batch Ingestion into Historical Store & Scale Limitations
    4. Streaming Capabilities with Apache Kafka & Redis
    5. Reliability issues with self-deployed resources like Kafka, Redis, PostgreSQL
  3. Managed Feature Store as a fork of Feast
    1. Overview of Core Architecture
    2. Integration of Azure Event Hubs as a replacement for Apache Kafka
    3. Introducing Databricks as Spark Backend
    4. Ensuring Scalability for large data sizes via Spark
    5. RBAC & Table Level Access Control for controlled usage
    6. End-to-End flow for real-time model serving
  4. Closing Arguments
    1. Increase in Productivity with ready-to-use Features
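
Agenda item 2.2 refers to point-in-time joins for batch retrieval. As a rough illustration of the idea only (this is not Feast's API or the speakers' implementation), the sketch below uses plain Python: for each entity row it picks the latest feature value recorded at or before the event timestamp, so no future data leaks into training rows. The function name, entity IDs, timestamps and feature values are all hypothetical.

```python
from bisect import bisect_right
from datetime import datetime


def point_in_time_join(entity_rows, feature_rows):
    """Attach to each (entity_id, event_ts) the latest feature value whose
    timestamp is <= event_ts, or None if no such value exists."""
    # Group feature rows per entity, ordered by timestamp.
    by_entity = {}
    for entity_id, ts, value in sorted(feature_rows, key=lambda r: (r[0], r[1])):
        timestamps, values = by_entity.setdefault(entity_id, ([], []))
        timestamps.append(ts)
        values.append(value)

    joined = []
    for entity_id, event_ts in entity_rows:
        timestamps, values = by_entity.get(entity_id, ([], []))
        # Number of feature rows with timestamp <= event_ts.
        idx = bisect_right(timestamps, event_ts)
        joined.append((entity_id, event_ts, values[idx - 1] if idx else None))
    return joined


# Hypothetical feature history: (entity_id, feature_timestamp, orders_last_7d)
feature_rows = [
    ("buyer_1", datetime(2021, 7, 1), 3),
    ("buyer_1", datetime(2021, 7, 5), 7),
    ("buyer_2", datetime(2021, 7, 4), 1),
]

# Hypothetical training events: (entity_id, event_timestamp)
entity_rows = [
    ("buyer_1", datetime(2021, 7, 3)),  # sees the July 1 value
    ("buyer_1", datetime(2021, 7, 6)),  # sees the July 5 value
    ("buyer_2", datetime(2021, 7, 2)),  # no feature value recorded yet
]

result = point_in_time_join(entity_rows, feature_rows)
for row in result:
    print(row)
```

A real feature store would typically also apply a maximum feature age and run this join in a distributed engine; the sketch omits both to keep the core rule visible.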

Link to slides: https://drive.google.com/file/d/1ocJNDbEUxXVJqyBVD35k-Vvr1y8hjN5k/view?usp=sharing

