The Fifth Elephant 2025 Annual Conference CfP

Karthika Vijayan

@karthikav

Enhancing retrieval in RAG: The fused features way

Submitted May 13, 2025

Retrieval-Augmented Generation (RAG) has emerged as a dominant framework for leveraging Large Language Models (LLMs) to generate responses grounded in extensive textual corpora. Within this architecture, the retrieval component plays a critical role in determining the overall system accuracy by surfacing the most relevant text chunks based on semantic similarity to the user query. Typically, this similarity is computed via cosine scores between vector embeddings of the query and document segments. This talk will highlight the limitations of conventional retrieval methods and motivate the need for more expressive and effective embedding strategies.
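As a rough illustration of the conventional retrieval step described above, here is a minimal sketch in plain Python. The chunk names and 3-dimensional vectors are invented toy values, not real embeddings, and `top_k` is a hypothetical helper rather than part of any RAG library.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as having no similarity
    return dot / (norm_u * norm_v)

def top_k(query_vec, chunk_vecs, k=2):
    """Return (chunk_id, score) pairs for the k most similar chunks."""
    scored = [(cid, cosine(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy embeddings, invented for illustration only.
chunks = {
    "chunk_a": [1.0, 0.0, 0.0],
    "chunk_b": [0.6, 0.8, 0.0],
    "chunk_c": [0.0, 0.0, 1.0],
}
query = [0.8, 0.6, 0.0]

results = top_k(query, chunks, k=2)
print(results)
```

In a real pipeline the vectors would come from an embedding model and the search would usually run inside a vector index, but the ranking logic is the same.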

We will look at both sparse and dense embeddings and how each captures different aspects of meaning in text. The talk will focus on how combining them gives a more complete representation of queries and documents. I will explain simple yet powerful techniques for fusing multiple embeddings and show how this improves retrieval results. Through practical examples and empirical insights, the session will demonstrate how such fusion techniques significantly outperform single-embedding baselines in RAG pipelines.
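The abstract does not say which fusion technique the talk uses. As one possible illustration only, the sketch below fuses a sparse term-count vector with a dense vector by L2-normalizing each, scaling them by an assumed weight `alpha`, and concatenating. The vocabulary, the dense vectors, and the helper names (`sparse_embed`, `fuse`) are all hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def sparse_embed(text, vocab):
    """Toy sparse embedding: raw term counts over a fixed vocabulary."""
    tokens = text.lower().split()
    return [float(tokens.count(term)) for term in vocab]

def fuse(sparse_vec, dense_vec, alpha=0.5):
    """Concatenate weighted, normalized sparse and dense parts.

    With this scaling, the dot product of two fused vectors equals
    alpha * cos(sparse parts) + (1 - alpha) * cos(dense parts).
    """
    s = [math.sqrt(alpha) * x for x in l2_normalize(sparse_vec)]
    d = [math.sqrt(1.0 - alpha) * x for x in l2_normalize(dense_vec)]
    return s + d

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Assumed vocabulary and toy dense vectors, for illustration only.
vocab = ["refund", "policy", "shipping"]
query_fused = fuse(sparse_embed("refund policy", vocab), [0.6, 0.8])
chunk_fused = fuse(sparse_embed("Refund policy", vocab), [1.0, 0.0])

score = dot(query_fused, chunk_fused)
print(round(score, 4))
```

Here the sparse parts match exactly while the dense parts only partly agree, so the fused score sits between the two individual cosine scores. Changing `alpha` shifts how much each view contributes.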

Outline of the talk

  • Introduction to embeddings as an effective representation of text
  • What sparse and dense embeddings are, and what they really represent
  • Ways to combine multiple embeddings into fused or composite features
  • Retrieval scores in RAG, showcasing the effectiveness of fused features

Takeaways

  • Learn about the underlying mechanisms of text embeddings
  • Learn cool ways to build features from text: simple refinements that make a huge difference in RAG
  • Hear about the nuances of some original work that I have done for in-house projects

If you’re a Gen AI enthusiast who expects to build many RAG systems and is curious about LLMs, this talk is for you!

Speaker bio
Dr. Karthika Vijayan is a Solution Consultant at Sahaj Software. She has been conducting research in conversational AI with voice and text data for almost a decade. Her research has been published in several journals and presented at various international conferences. Prior to joining Sahaj Software, she worked as a research fellow at the National University of Singapore and at IISc Bangalore. She holds a PhD from IIT Hyderabad.

Previous talk links
https://www.youtube.com/watch?v=o6YHcDLod8A
https://www.youtube.com/watch?v=-uoUwGpzIL0
https://www.youtube.com/watch?v=kphYc_lvKIk&list=PLkPaq00oPRfzz9O4q06rOL2dHCEX7PQwU&index=18
https://www.youtube.com/watch?v=gvJhtBdmUi8&t=897s

Profile links
https://scholar.google.com/citations?user=fJp6O0UAAAAJ&hl=en
https://www.linkedin.com/in/karthika-vijayan/
https://www.researchgate.net/profile/Karthika-Vijayan
