The Fifth Elephant 2024 Annual Conference (12th & 13th July)
Maximising the Potential of Data — Discussions around data science, machine learning & AI
13 Jul 2024 (Sat), 09:00 AM – 06:05 PM IST
Abhijeet Kumar
This talk takes the audience through our experience of building a content generation solution for a data catalog enrichment effort, from a modeling perspective (a RAG-based pre-trained model and a RAG-based fine-tuned model).
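As a rough illustration of what a retrieval-augmented, few-shot setup for column descriptions might look like, the sketch below retrieves the most similar curated catalog entries and assembles them into a prompt. Everything in it is an assumption made for illustration: the `CURATED` entries, the table and column names, the token-overlap retriever and the prompt wording are hypothetical, not the solution presented in the talk. A production retriever would more likely use embeddings.

```python
import re
from collections import Counter

# Hypothetical curated entries; in practice these would come from the
# small curated portion of the data catalog.
CURATED = [
    {"table": "customer_account", "column": "acct_open_dt",
     "description": "Date on which the customer account was opened."},
    {"table": "customer_account", "column": "cust_id",
     "description": "Unique identifier of the customer who owns the account."},
    {"table": "card_transaction", "column": "txn_amt",
     "description": "Amount of the card transaction in the account currency."},
]


def tokens(text):
    """Count lowercase alphanumeric tokens; underscores act as separators."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(a, b):
    """Number of tokens shared by two strings (a toy similarity score)."""
    return sum((tokens(a) & tokens(b)).values())


def retrieve(table, column, k=2):
    """Return the k curated entries whose names best match the target column."""
    query = f"{table} {column}"
    ranked = sorted(
        CURATED,
        key=lambda e: overlap(query, f"{e['table']} {e['column']}"),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(table, column, k=2):
    """Assemble a few-shot prompt from the retrieved examples."""
    lines = ["Write a one-sentence business description for the last column.", ""]
    for e in retrieve(table, column, k):
        lines += [
            f"Table: {e['table']}",
            f"Column: {e['column']}",
            f"Description: {e['description']}",
            "",
        ]
    lines += [f"Table: {table}", f"Column: {column}", "Description:"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("customer_account", "acct_close_dt"))
```

The resulting string would then be sent to whichever model is being evaluated; the model call itself is left out because it depends on the serving setup.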
For this use-case, I will talk about the approach taken to
I will cover the fine-tuning details using the LoRA technique and compare the results from three models: few-shot pre-trained Llama2-13B, few-shot fine-tuned Llama2-7B, and GPT-3.5 Turbo.
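For readers new to LoRA, the idea is to freeze a weight matrix W and train only a low-rank update, so the layer computes W x + (alpha / r) * B (A x), where A has shape r x d_in and B has shape d_out x r. The sketch below shows that forward pass and the resulting trainable-parameter count in plain Python. The matrix size, rank and alpha used here are illustrative assumptions, not the settings used in the talk.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, alpha):
    """Compute W x + (alpha / r) * B (A x); W stays frozen, A and B are trained."""
    rank = len(A)                      # A has one row per rank dimension
    base = matvec(W, x)                # frozen pre-trained path
    delta = matvec(B, matvec(A, x))    # low-rank update path
    scale = alpha / rank
    return [b + scale * d for b, d in zip(base, delta)]


def lora_trainable_params(d_out, d_in, rank):
    """Parameters in A (rank x d_in) plus B (d_out x rank)."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 1.0]]
    A = [[1.0, 1.0]]                   # rank 1
    B_zero = [[0.0], [0.0]]            # B starts at zero, so the update is a no-op
    x = [2.0, 3.0]
    print(lora_forward(W, A, B_zero, x, alpha=2.0))

    # Illustrative sizes only: one 4096 x 4096 projection with rank 8.
    full = 4096 * 4096
    lora = lora_trainable_params(4096, 4096, rank=8)
    print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.4%}")
```

Because B is initialised to zero, the adapted layer starts out identical to the pre-trained one, and only the small A and B matrices receive gradient updates during fine-tuning.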
The talk will draw various insights about the behaviour of these models in terms of content generation, including accuracy (against ground truth), alignment (factual consistency with prompt inputs) and toxicity detection.
An enterprise data catalog is a large effort in any enterprise to keep curated metadata about its data for user reference. This mainly involves writing descriptions of tables and their columns for business consumption, which has always been a manual effort.
Here, we are talking about hundreds of database schemas, thousands of database tables and millions of columns in the data catalog. Often, only 3-5% of the content is curated. The objective is to enrich the data catalog using an AI solution.
This talk is intended for data engineers, data scientists or researchers in the GenAI space who want to understand model behaviour in different constructs (RAG, fine-tuning, etc.).
It is also intended for data leaders, data stewards and data SMEs who are closer to enterprise data. The topic might interest them as an initiative to enrich the metadata of enterprise data catalogs.
In general, the talk will be relevant to any professional working on a GenAI use case with Python.