The Fifth Elephant OSAI meet-up - Hyderabad edition

Nashit Babber

@nashit93

Building Production-Ready RAG Systems: From Hugging Face Models to Self-Healing Pipelines

Submitted Sep 18, 2025

Open-source LLMs such as Mistral, Llama, and the many models on Hugging Face have democratized AI development. However, the journey from downloading a model to deploying a production-ready Retrieval-Augmented Generation (RAG) system is filled with hidden challenges. This talk bridges that gap by sharing battle-tested strategies for building RAG systems that scale, self-monitor, and integrate seamlessly into your SDLC.

Drawing from real-world experience deploying AI systems that serve millions of queries, I’ll demonstrate how to turn open-source models into dependable production systems. We’ll explore practical implementations using LangChain, vector databases, and Hugging Face models, showing how to build RAG pipelines that not only work but thrive in production. You’ll learn how to implement intelligent caching strategies, build self-healing mechanisms that reduce manual intervention by 80%, and create feedback loops that continuously improve retrieval quality. The session includes live demos of deploying a Mistral-7B-based RAG system with automatic failover and performance monitoring.
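To make the caching and automatic failover ideas concrete, here is a minimal sketch in plain Python. It is not the pipeline demonstrated in the talk: the class name FailoverRAG, the normalize helper, and the idea of modelling each model backend as a simple callable are all assumptions made for illustration, and the cache is an in-memory dictionary with no expiry.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical backend type: takes a prompt and returns generated text.
# In a real system this could wrap a hosted or local model endpoint.
Backend = Callable[[str], str]


def normalize(query: str) -> str:
    """Collapse case and whitespace so near-identical queries share a cache entry."""
    return " ".join(query.lower().split())


class FailoverRAG:
    """Tries backends in priority order and caches the first successful answer."""

    def __init__(self, backends: Sequence[Tuple[str, Backend]]):
        if not backends:
            raise ValueError("at least one backend is required")
        self.backends = list(backends)
        self.cache: Dict[str, str] = {}
        # Per-backend failure counts, a simple hook for monitoring.
        self.failures: Dict[str, int] = {name: 0 for name, _ in self.backends}

    def answer(self, query: str) -> Tuple[str, str]:
        """Return (answer, source), where source is 'cache' or a backend name."""
        key = normalize(query)
        if key in self.cache:
            return self.cache[key], "cache"

        errors: List[str] = []
        for name, backend in self.backends:
            try:
                result = backend(query)
            except Exception as exc:  # any backend error triggers failover
                self.failures[name] += 1
                errors.append(f"{name}: {exc}")
                continue
            self.cache[key] = result
            return result, name

        raise RuntimeError("all backends failed: " + "; ".join(errors))
```

A caller would pass an ordered list such as [("primary", call_primary), ("standby", call_standby)], where both callables are placeholders for whatever serves the model. A production version would likely add cache expiry, timeouts, and a circuit breaker so that a failing primary is not retried on every request.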

Key Takeaways:

  • Practical blueprint for productionizing open-source LLMs (Mistral, Llama) with RAG, including vector database selection, chunking strategies, and embedding optimization
  • Live demonstration of building self-monitoring RAG pipelines that detect and auto-correct retrieval failures using open-source tools
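
The chunking and self-monitoring takeaways above can be illustrated with a small, dependency-free sketch. Everything here is an assumption for illustration rather than the talk's implementation: a bag-of-words cosine score stands in for real embeddings and a vector database, the min_score threshold is arbitrary, and rewrite is a hypothetical query-rewriting callable supplied by the caller.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Tuple


def chunk_words(text: str, size: int = 40, overlap: int = 10) -> List[str]:
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def bow(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], k: int = 1) -> List[Tuple[float, str]]:
    """Return the k highest-scoring (score, chunk) pairs."""
    q = bow(query)
    scored = sorted(((cosine(q, bow(c)), c) for c in chunks), reverse=True)
    return scored[:k]


def retrieve_with_self_check(
    query: str,
    chunks: List[str],
    rewrite: Callable[[str], str],
    min_score: float = 0.2,
    k: int = 1,
) -> Tuple[str, List[Tuple[float, str]]]:
    """Detect a weak retrieval and retry once with a rewritten query.

    Returns a status ('ok', 'recovered', or 'failed') plus the hits, so the
    caller can log the outcome or abstain instead of answering without context.
    """
    hits = retrieve(query, chunks, k)
    if hits and hits[0][0] >= min_score:
        return "ok", hits
    retry = retrieve(rewrite(query), chunks, k)
    if retry and retry[0][0] >= min_score:
        return "recovered", retry
    return "failed", []
```

The returned status is the monitoring signal: counting "recovered" and "failed" outcomes over time shows where retrieval is weak, and a "failed" result lets the generation step decline to answer rather than respond without supporting context.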

Target Audience:
ML engineers, backend developers, and DevOps engineers looking to deploy open-source AI models in production environments. Particularly beneficial for teams transitioning from POCs to production-grade AI systems.

About the Speaker:
Nashit Babber is a Lead Data Scientist with 10+ years of experience architecting scalable AI/ML solutions. He specializes in generative AI and RAG systems, and has led the development of patent-pending AI models that earned recognition with a World Economic Forum Lighthouse Award. An active open-source contributor, Nashit maintains projects including an English-to-Hindi translation model on Hugging Face and shares his expertise through his YouTube channel AITitBits, which focuses on LLM architectures and production AI systems.
