The Fifth Elephant 2023 Winter

On the engineering and business implications of AI & ML

Simrat Hanspal

Let's build a production-ready RAG pipeline the right way!

Submitted Nov 15, 2023

Retrieval Augmented Generation (RAG) is a technique for augmenting Large Language Models (LLMs) with external knowledge, so that language models can be applied to a range of use cases over enterprise and real-time data.

But, like every coin has two sides, RAG comes with its own set of challenges. First, building secure data APIs is time-consuming. On top of that, having an LLM resolve incoming queries against data models makes it hard to build a highly reliable and deterministic pipeline. Businesses can’t scale without determinism. Imagine a nondeterministic flight management system that occasionally schedules flights at the wrong time. We would end up adding multiple validation steps, and scaling would become very expensive.

In most RAG setups, we fetch information and leave it to the LLM to decide whether the context contains enough information to answer the question. LLM tasks are defined to do one specific thing, such as information extraction or summarization, not to validate data. Even if we did ask the LLM to validate the data, it would be the wrong approach, because we would be compounding the error of a wrong context with LLM hallucinations. The checks for valid context need to happen at the data source.
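
To make the idea of checking context at the data source concrete, here is a minimal sketch in Python. It is not Hasura's API or the speaker's implementation. The `Flight` record, the in-memory `FLIGHTS` store, and the functions `fetch_valid_context` and `answer` are all hypothetical names invented for illustration, and the LLM is represented by any callable that takes a prompt string. The point is only that the data layer decides, deterministically, whether valid context exists before the LLM is ever called.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Flight:
    # Hypothetical record type, used only for illustration.
    number: str
    departure: str  # ISO 8601 timestamp; empty string means missing
    status: str


# Hypothetical in-memory stand-in for a real data source.
FLIGHTS = {
    "AI101": Flight("AI101", "2023-12-01T09:30:00", "scheduled"),
    "AI202": Flight("AI202", "", "scheduled"),  # incomplete row
}


def fetch_valid_context(flight_number: str) -> Optional[str]:
    """Return context only if the record exists and passes data checks."""
    flight = FLIGHTS.get(flight_number)
    if flight is None:
        return None  # no record, so nothing to ground an answer on
    if not flight.departure:
        return None  # incomplete data is rejected at the source
    return (
        f"Flight {flight.number} departs at {flight.departure} "
        f"({flight.status})."
    )


def answer(question: str, flight_number: str, llm: Callable[[str], str]) -> str:
    """Call the LLM only when validated context is available."""
    context = fetch_valid_context(flight_number)
    if context is None:
        # Deterministic fallback: the LLM never sees invalid context.
        return "No reliable data is available to answer this question."
    prompt = f"Context: {context}\nQuestion: {question}"
    return llm(prompt)
```

With this shape, a query about `AI202` or an unknown flight returns the fixed fallback message without invoking the model, so the LLM is never asked to judge whether its own context is trustworthy.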

In this talk, I will present the challenges of building production-ready RAG and show how Hasura, with its understanding of your data model, can help you build a highly deterministic RAG pipeline by cutting down noise.
