The Fifth Elephant 2023 Winter
On the engineering and business implications of AI & ML
Friday, 8 December 2023, 09:00 AM – 04:15 PM IST
Retrieval Augmented Generation (RAG) is a technique for augmenting Large Language Models (LLMs) with external knowledge, letting us apply the power of language models to many use cases over enterprise and real-time data.
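The flow described above can be sketched in a few lines: retrieve relevant documents, then augment the LLM prompt with them. The corpus, keyword-overlap scoring, and prompt template below are illustrative placeholders, not the pipeline discussed in the talk.

```python
# Minimal RAG sketch: retrieve context, then build an augmented prompt.
# CORPUS and the overlap scorer are toy stand-ins for a real vector store.

CORPUS = {
    "refunds": "Refunds are processed within 5 business days.",
    "shipping": "Orders ship from the Bengaluru warehouse.",
}

def retrieve(query: str, k: int = 1) -> list:
    """Rank documents by naive keyword overlap with the query."""
    def score(text: str) -> int:
        return len(set(query.lower().split()) & set(text.lower().split()))
    ranked = sorted(CORPUS.values(), key=score, reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Augment the user's question with the retrieved context."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How long do refunds take?"))
```

In a production setup the keyword scorer would be replaced by embedding similarity search, and the prompt would go to an actual LLM endpoint.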
But, like every coin, the RAG technique has two sides, and it comes with its own challenges. First, building secure data APIs is time-consuming; on top of that, relying on LLMs to resolve incoming queries against data models makes it hard to build a reliable, deterministic pipeline. Businesses cannot scale without determinism. Imagine a nondeterministic flight management system that occasionally schedules flights at the wrong time: we would end up adding multiple validation steps, and scaling would become prohibitively expensive.
In most RAG setups, we fetch information and leave it to the LLM to decide whether the context contains enough information to answer the question. But LLM tasks are defined to execute a specific job, such as information extraction or summarization, not data validation. Even if we instructed the LLM to validate the data, that would be the wrong approach: we would be compounding the error of wrong context with LLM hallucinations. The checks for valid context belong at the data source.
In this talk, I will present the challenges of building a production-ready RAG pipeline and show how Hasura, with its understanding of your data model, helps you build a highly deterministic RAG pipeline by cutting down noise.