The Fifth Elephant 2025 Annual Conference CfP
Speak at The Fifth Elephant 2025 Annual Conference
Submitted May 6, 2025
Existing Retrieval Augmented Generation (RAG) based Q&A systems can process only textual information and cannot answer from infographics (visual elements of information) such as tables, charts, and images in documents, which limits their value and productivity.
Vision language models encode visual elements along with textual information, which makes them usable for retrieval over complex documents. However, there are a few challenges in scaling, such as:
The talk presents an efficient, state-of-the-art visually augmented search and question-answering system at scale, built by integrating vision embeddings with popular vector databases (OpenSearch, ElasticSearch, FAISS). The RAG-based solution retrieves the best matches, performs late-interaction re-ranking, and uses a multi-modal LM to generate exact answers. Our benchmarking results show high accuracy in a scalable setting.
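As a rough illustration of the retrieve-then-re-rank flow described above, here is a minimal sketch in Python. It is not the system presented in the talk. It assumes each page is already encoded as a matrix of patch embeddings by some vision language model, uses a brute-force NumPy search in place of a vector database query, and scores candidates with a MaxSim-style late-interaction sum. The function names (`first_stage`, `maxsim`, `search`) and the parameters `k` and `top_n` are hypothetical.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale vectors along the last axis to unit length."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def first_stage(query_tokens: np.ndarray, page_patches: list, k: int) -> list:
    """Coarse retrieval over one mean-pooled vector per page.

    Assumption: this brute-force cosine search stands in for a k-NN query
    against a vector database (OpenSearch, ElasticSearch, FAISS).
    """
    q = l2_normalize(query_tokens.mean(axis=0))
    pooled = l2_normalize(np.stack([p.mean(axis=0) for p in page_patches]))
    scores = pooled @ q
    return np.argsort(-scores)[:k].tolist()


def maxsim(query_tokens: np.ndarray, patches: np.ndarray) -> float:
    """Late-interaction score: each query token keeps its best-matching
    patch similarity, and the per-token maxima are summed."""
    sim = l2_normalize(query_tokens) @ l2_normalize(patches).T
    return float(sim.max(axis=1).sum())


def search(query_tokens: np.ndarray, page_patches: list,
           k: int = 10, top_n: int = 3) -> list:
    """Retrieve k candidate pages, then re-rank them with maxsim."""
    candidates = first_stage(query_tokens, page_patches, k)
    reranked = sorted(
        candidates,
        key=lambda i: maxsim(query_tokens, page_patches[i]),
        reverse=True,
    )
    return reranked[:top_n]
```

Calling `search(query_tokens, page_patches)` returns the indices of the top-ranked pages, which would then be passed with the question to a multi-modal LM for answer generation. Keeping the first stage on a single pooled vector per page keeps the index small, while the costlier token-level scoring runs only on the `k` candidates.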
Outline:
Takeaways:
Audience
Biography
I am Director, Data Science at Fidelity Investments with 12+ years of relevant experience in solving problems using advanced analytics, machine learning, and deep learning techniques. I started my career as a computer scientist in a government research organization (Bhabha Atomic Research Center) and did research across a variety of domains such as conversational speech, satellite imagery, and text.
As part of my work, I have published and presented several research papers at multiple research conferences over the years. I have previously spoken at The Fifth Elephant and PyCon conferences. I have also trained professionals in machine learning (M.Tech course) as Guest Faculty in the WILP program at BITS, Pilani.
Slides
Coming soon.