Advancing multimodal and agentic AI: systems, storage & scalability
Open Source AI Meet-up - Bangalore edition
Fri, 4 Apr 2025, 01:45 PM – 06:10 PM IST
Submitted Mar 28, 2025
Preparing high-quality datasets is a critical yet time-consuming step in building large language models (LLMs). Data Prep Kit, an open-source Python toolkit, simplifies and automates data preparation tasks, enabling faster and more efficient workflows for LLM applications. This session will explore how Data Prep Kit addresses key challenges such as text extraction, deduplication, and data quality scoring, along with insights from real-world use cases such as building a custom retrieval-augmented generation (RAG) application. Attendees will learn how to use the toolkit to streamline their data pipelines, improve dataset quality, and speed up LLM development.
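To give a feel for two of the tasks named above, here is a minimal sketch of exact deduplication followed by a simple quality filter. It uses only the Python standard library and is not Data Prep Kit's API; the function names (`normalize`, `deduplicate`, `quality_score`, `prepare`), the alphabetic-token heuristic, and the `min_score` threshold are all assumptions made for illustration.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies hash alike."""
    return re.sub(r"\s+", " ", text).strip().lower()


def deduplicate(docs: list[str]) -> list[str]:
    """Exact deduplication: keep the first document seen for each content hash."""
    seen: set[str] = set()
    unique: list[str] = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def quality_score(text: str) -> float:
    """Illustrative heuristic (an assumption, not the toolkit's scorer):
    the fraction of whitespace-separated tokens that are purely alphabetic."""
    tokens = text.split()
    if not tokens:
        return 0.0
    return sum(token.isalpha() for token in tokens) / len(tokens)


def prepare(docs: list[str], min_score: float = 0.5) -> list[str]:
    """Deduplicate first, then drop documents scoring below min_score."""
    return [doc for doc in deduplicate(docs) if quality_score(doc) >= min_score]


if __name__ == "__main__":
    sample = ["Hello  world", "hello world", "1234 5678 ####", "Data prep matters"]
    for doc in prepare(sample):
        print(doc)
```

A real pipeline would typically add fuzzy (near-duplicate) matching and richer quality signals; this sketch only shows the overall shape of a dedup-then-filter step.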
Here are some key resources to explore and understand the capabilities of Data Prep Kit and its applications in LLM development:
These resources provide technical depth, practical examples, and community-driven insights to help you fully leverage Data Prep Kit in your projects.
Vrunda Gadesha - AI Advocate | IBM
She is a Data Scientist, Ph.D. scholar, and AI enthusiast with expertise in Large Language Models, Natural Language Processing, Machine Learning, and technical content creation. Skilled in Python Programming, she has led AI solution development and shared her knowledge through academic writing and corporate training. She is passionate about advancing AI and data science and is committed to continuous learning and impactful innovation.