A brief talk on a Managed Feature Store built on top of open source Feast. We will start with a brief walkthrough of the open source Feast feature store, including its architecture and core capabilities, and call out some of its challenges and limitations. We will then describe the enhancements that enable a more robust, secure and scalable deployment: a) managed resources on cloud platforms, e.g. Azure Event Hubs in place of Kafka and Databricks in place of open source Spark; b) integration of RBAC and Table Level Access Control to maintain controlled usage; c) scalable batch ingestion using Spark instead of Pandas, plus new capabilities that increase data reusability.
Speakers:
Dr Mohit Kumar (Head - Data Science, Product Analytics and Data Platform)
Sai Sharan Tangeda (Data Scientist)
Time: 30 mins
- Introduction
  - Introduction
  - Motivation for maintaining a Managed Feature Store
- Feast (Open Source): Constructs, Core Capabilities & Limitations
  - Constructs & Architecture of Feast
  - Point-in-Time Join Capabilities with Batch Retrieval
  - Batch Ingestion into Historical Store & Scale Limitations
  - Streaming Capabilities with Apache Kafka & Redis
  - Reliability issues with self-deployed resources like Kafka, Redis, PostgreSQL
- Managed Feature Store as a fork of Feast
  - Overview of Core Architecture
  - Integration of Azure Event Hubs as a replacement for Apache Kafka
  - Introducing Databricks as Spark Backend
  - Ensuring Scalability for large data sizes via Spark
  - RBAC & Table Level Access Control for controlled usage
  - End-to-End flow for real-time model serving
- Closing Arguments
  - Increase in Productivity with ready-to-use Features
Link to slides: https://drive.google.com/file/d/1ocJNDbEUxXVJqyBVD35k-Vvr1y8hjN5k/view?usp=sharing