With recent advances in LLMs, embedding generation, and Retrieval-Augmented Generation (RAG), there is immense interest in using these technologies to solve problems involving semantic search, chatbots, code graphs, knowledge graphs, etc.

This has led to the development of data management systems that can handle large-scale generation, storage, and search of vector embeddings.

This Birds of a Feather (BOF) session was held at The Fifth Elephant 2024 conference on 13th July. The BOF was targeted at a technical audience who were either using vector databases in production or exploring the possibility of adding vector DBs to their tech stack.

Outline

What are vector databases? Why should one care?

What problems are solved by Vector DBs over traditional databases?
Brief overview of RAG and how Vector DBs fit into the picture.
Compare with a traditional database based on architecture and based on problems solved.

How are vector DBs built? How do they work under the hood?

Brief, high-level overview of the inner workings of a Vector Database and the considerations made while building one.

What are the main Vector Databases out there in the market? How does one decide which one to go with?

Compare with popular DBs. Explore their pros & cons.
Which DBs are good for which specific use cases?
Thoughts on going with something like pgvector, where the database primarily offers full-text search and vector search is an additional option.

Performance comparison with some metrics

Ingestion speed, query speed, Recall.

Any drawbacks with VectorDBs in production

What are the concerns one might have when using these in production?
Are there specific caveats one should keep in mind while using VectorDBs, e.g. access control?

Discussion summary and key takeaways for audience

Recall vs. efficiency trade-off: In space-partitioning-based vector indexes, achieving very high recall (>95%) often requires probing additional clusters, and the returns diminish: each further gain in recall costs disproportionately more computation.
Quantization for memory optimization: Binary quantization techniques can significantly reduce memory usage while maintaining acceptable performance. For example, 36 million Wikipedia paragraphs were stored in just 2GB of RAM using binary quantization.
Hybrid search strategies: Combining traditional database filtering with vector search can improve query performance. Filtering on regular data before performing vector search reduces the number of vector distance calculations when the filter is highly selective, i.e. when few rows pass it.
Challenges in index updates: Updating large vector indexes while maintaining system performance and recall is a challenge. Index updates are CPU-intensive, especially for graph-based indexes, and maintaining index quality as updates accumulate is also difficult. For major updates, mirroring the index and rebuilding it on separate machines before switching over was suggested as a solution.
Embedding quality is crucial: The effectiveness of vector search heavily depends on the quality of the underlying embedding model. Techniques like PCA can be used to visualize embeddings and check for effective clustering.
Sequential vs. random access: For certain data sizes, sequential scans can be more efficient than random reads. This is particularly relevant when the data doesn’t fit entirely in memory and graph-based approaches lead to random access patterns.
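The recall vs. efficiency takeaway can be illustrated with a toy cluster-based (IVF-style) index. This is a minimal sketch on synthetic data, not any particular database's implementation: the function names, the `nprobe` parameter name, and the use of randomly sampled vectors as centroids (a real index would typically run k-means) are all assumptions made for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_ivf(vectors, centroids):
    """Assign each vector id to its nearest centroid (one inverted list per cluster)."""
    lists = [[] for _ in centroids]
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: sq_dist(vec, centroids[c]))
        lists[nearest].append(vid)
    return lists

def ivf_search(query, vectors, centroids, lists, k, nprobe):
    """Scan only the nprobe clusters closest to the query.

    Returns (top-k ids among the scanned vectors, number of vectors scanned).
    """
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    candidates = [vid for c in order[:nprobe] for vid in lists[c]]
    top = sorted(candidates, key=lambda vid: sq_dist(query, vectors[vid]))[:k]
    return top, len(candidates)

def exact_search(query, vectors, k):
    """Brute-force top-k, used as ground truth for recall."""
    return sorted(range(len(vectors)), key=lambda vid: sq_dist(query, vectors[vid]))[:k]

def recall_at_k(approx, exact):
    """Fraction of the true top-k that the approximate search returned."""
    return len(set(approx) & set(exact)) / len(exact)

# Synthetic setup (all sizes are arbitrary illustration values).
rng = random.Random(0)
dim, n, n_clusters, k = 8, 2000, 20, 10
vectors = [[rng.random() for _ in range(dim)] for _ in range(n)]
centroids = rng.sample(vectors, n_clusters)  # crude stand-in for k-means centroids
lists = build_ivf(vectors, centroids)
queries = [[rng.random() for _ in range(dim)] for _ in range(20)]

for nprobe in (1, 2, 5, 10, 20):
    recalls, scanned = [], []
    for q in queries:
        ids, cost = ivf_search(q, vectors, centroids, lists, k, nprobe)
        recalls.append(recall_at_k(ids, exact_search(q, vectors, k)))
        scanned.append(cost)
    print(f"nprobe={nprobe:2d}  mean recall={sum(recalls) / len(recalls):.3f}  "
          f"mean vectors scanned={sum(scanned) // len(scanned)}")
```

Recall can only rise as `nprobe` grows, because each larger probe set contains the smaller one, but the number of vectors scanned grows with it. Probing every cluster is equivalent to a brute-force scan.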
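The binary quantization takeaway can be sketched as follows. This is one simple variant (keep only the sign of each dimension, compare with Hamming distance), assumed here for illustration; the session summary does not say which scheme or embedding dimension was behind the Wikipedia figure, so the 256-dimension setting and all function names below are hypothetical.

```python
import random

def binarize(vec):
    """Pack the sign of each dimension into one integer: bit i is 1 when vec[i] > 0."""
    code = 0
    for i, x in enumerate(vec):
        if x > 0:
            code |= 1 << i
    return code

def hamming(a, b):
    """Number of bit positions where two binary codes differ."""
    return bin(a ^ b).count("1")

def search(query, codes, k):
    """Rank stored codes by Hamming distance to the binarized query."""
    q = binarize(query)
    return sorted(range(len(codes)), key=lambda i: hamming(q, codes[i]))[:k]

dim = 256  # hypothetical embedding dimension
rng = random.Random(0)
vectors = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(1000)]
codes = [binarize(v) for v in vectors]

# Storage per vector: 32 bits per dimension as float32 vs. 1 bit per dimension.
float32_bytes = dim * 4
binary_bytes = dim // 8
print(float32_bytes, binary_bytes, float32_bytes // binary_bytes)  # 1024 32 32

# A query identical to a stored vector has Hamming distance 0 to its own code.
print(search(vectors[42], codes, k=1))
```

Python integers are used here only for readability; a real system would pack the bits into contiguous byte arrays and often re-rank the top candidates with the full-precision vectors to recover accuracy.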
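The hybrid search takeaway (filter on regular columns first, then run vector search on what survives) can be sketched as below. The row layout, the `lang` column, and the function names are hypothetical; a real database would push the predicate into its query planner rather than filter in application code.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def filtered_search(query, rows, predicate, k):
    """Apply the metadata predicate first, then rank only the surviving rows.

    Returns (top-k rows, number of distance computations performed).
    """
    survivors = [row for row in rows if predicate(row)]
    ranked = sorted(survivors, key=lambda row: sq_dist(query, row["embedding"]))
    return ranked[:k], len(survivors)

# Tiny hypothetical table: a regular column plus an embedding column.
rows = [
    {"id": 1, "lang": "en", "embedding": [0.10, 0.90]},
    {"id": 2, "lang": "hi", "embedding": [0.20, 0.80]},
    {"id": 3, "lang": "en", "embedding": [0.90, 0.10]},
    {"id": 4, "lang": "ta", "embedding": [0.15, 0.85]},
]

hits, n_distances = filtered_search([0.0, 1.0], rows, lambda r: r["lang"] == "en", k=1)
print([h["id"] for h in hits], n_distances)  # [1] 2
```

Only two of the four rows reach the distance computation here. The benefit shrinks as the filter lets more rows through, which is why the takeaway is conditioned on a highly selective filter.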
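The embedding quality takeaway mentions PCA as a way to check whether embeddings cluster sensibly. A minimal sketch follows, assuming synthetic 6-dimensional "embeddings" for two made-up topics and a hand-rolled PCA via power iteration; in practice one would more likely use a numerical library and plot the 2D points. All names and sizes are illustrative assumptions.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean_center(data):
    """Subtract the per-dimension mean so the data is centered at the origin."""
    dim = len(data[0])
    means = [sum(row[j] for row in data) / len(data) for j in range(dim)]
    return [[row[j] - means[j] for j in range(dim)] for row in data]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, dim = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
            for i in range(dim)]

def top_component(cov, rng, iters=200):
    """Power iteration: repeatedly apply cov and renormalize to approach the dominant eigenvector."""
    v = [rng.gauss(0, 1) for _ in cov]
    for _ in range(iters):
        w = [dot(row, v) for row in cov]
        norm = dot(w, w) ** 0.5
        v = [x / norm for x in w]
    return v

def deflate(cov, v):
    """Subtract the variance explained by unit vector v, exposing the next component."""
    lam = dot(v, [dot(row, v) for row in cov])
    size = len(v)
    return [[cov[i][j] - lam * v[i] * v[j] for j in range(size)] for i in range(size)]

def pca_2d(data, seed=0):
    """Project rows onto their first two principal components."""
    rng = random.Random(seed)
    centered = mean_center(data)
    cov = covariance(centered)
    pc1 = top_component(cov, rng)
    pc2 = top_component(deflate(cov, pc1), rng)
    return [(dot(row, pc1), dot(row, pc2)) for row in centered]

# Two synthetic "topics" whose embeddings sit in different regions of the space.
rng = random.Random(1)
dim = 6
topic_a = [[rng.gauss(0.0, 0.3) for _ in range(dim)] for _ in range(30)]
topic_b = [[rng.gauss(5.0, 0.3) for _ in range(dim)] for _ in range(30)]
points = pca_2d(topic_a + topic_b)

mean_a = sum(p[0] for p in points[:30]) / 30
mean_b = sum(p[0] for p in points[30:]) / 30
print(f"topic A mean on PC1: {mean_a:.2f}, topic B mean on PC1: {mean_b:.2f}")
```

If documents known to be about different topics land in overlapping regions of such a projection, that is a hint (not proof, since PCA discards information) that the embedding model is not separating them well.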

Moderator and speakers at the BOF

Aldrin Jenson moderated the discussion. He is a Product Engineer at Athena Intelligence. Aldrin was part of the Indic Subtitler team, which was shortlisted for the Hack5 2024 edition Demo Day competition.
Rajkumar Iyer works on the effort to build Native Vector Database support in Microsoft SQL Server.
Chaitanya Chokkareddy is a co-founder of Ozonetel.
Aditi Ahuja is an SWE-2 at Couchbase. She is part of the organizing team of the Bengaluru Systems meet-up.
Sumesh Meppadath is a Product Manager at GalaxEye.


