Transforming Document Curation: LMs and vector databases at scale

Aug 11, 2023 (Fri), 09:00 AM – 06:00 PM IST

Bangalore International Centre (BIC), Bengaluru

Submitted Jun 30, 2023

Abstract

Auquan is an AI startup that serves institutional investors and investment managers with curated news and documents to help them make better investment decisions.

In this presentation, I will discuss our approach to using a vector database and a tuned language model to curate news items at scale.

I will walk through the process and pitfalls of using these technologies in production, and share best practices for achieving high performance. Specifically, I will discuss the metrics that we used to select a vector database and to tune our language model.

Audience

ML engineers and early-stage data scientists

Takeaways

How to use a vector database and a language model to curate news items at scale
Best practices for using vector databases in production
Pitfalls to avoid when using vector databases

Presentation Outline:

Introduction
- About Auquan
- Problem description
Vector databases for news curation
- Choosing a vector database
- Using embeddings with vector databases for different tasks
- Offline population and real-time inference
Tuning a language model for news curation
- Tuning an LM
- Using the tuned model for embeddings
Stack architecture
Conclusion and Q&A
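
To make the "offline population and real-time inference" item in the outline concrete, here is a minimal sketch in Python. It is not Auquan's stack: the `embed` function is a hypothetical hashed bag-of-words stand-in for a tuned language model encoder, and `InMemoryVectorIndex` is a toy class written for illustration in place of a real vector database. It only shows the general pattern of embedding documents ahead of time and ranking them by cosine similarity at query time.

```python
import hashlib
import math
from collections import Counter

DIM = 64  # toy embedding size (assumption); a real encoder defines its own


def embed(text: str) -> list[float]:
    """Hypothetical stand-in for a tuned LM encoder.

    Hashes each whitespace token into one of DIM buckets and
    L2-normalises the result, so a dot product equals cosine similarity.
    """
    vec = [0.0] * DIM
    for token, count in Counter(text.lower().split()).items():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += count
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class InMemoryVectorIndex:
    """Toy replacement for a vector database (illustration only)."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def upsert(self, doc_id: str, text: str) -> None:
        # Offline population: embed once and store under the document id.
        self._vectors[doc_id] = embed(text)

    def query(self, text: str, top_k: int = 3) -> list[tuple[str, float]]:
        # Real-time inference: embed the query, score every stored vector
        # by brute force, and return the top_k (id, similarity) pairs.
        q = embed(text)
        scored = [
            (doc_id, sum(a * b for a, b in zip(q, vec)))
            for doc_id, vec in self._vectors.items()
        ]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]
```

A usage example under the same assumptions: call `index.upsert("n1", "central bank raises interest rates")` for each news item in a batch job, then call `index.query("interest rates", top_k=5)` when a request arrives. A production vector database would replace the brute-force scan with an approximate nearest-neighbour index.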

Link to the Presentation

https://drive.google.com/file/d/1FubDUOYBJhzgaW-H6qxrlkEO49d3abDr/view?usp=sharing

Hosted by

The Fifth Elephant

The Fifth Elephant 2023 Monsoon