Submissions

The Fifth Elephant

The Fifth Elephant 2024 Annual Conference (12th & 13th July)

Maximising the Potential of Data — Discussions around data science, machine learning & AI

Jul 2024

8 Mon

9 Tue

10 Wed

11 Thu

12 Fri

13 Sat 09:00 AM – 06:05 PM IST

14 Sun

Bangalore International Centre, Bangalore

General guidelines for conference submissions

We appreciate that many participants create submissions out of a genuine desire to share knowledge with our community, help solve common problems or contribute in a meaningful way. However, we find that they sometimes fall short of achieving this objective because the written submissions fail to capture the attention of the community or to gain acceptance through The Fifth Elephant’s peer review process. More often than not, this is because the submission does not explain its intent with sufficient clarity or detail.

The template (and example) is an attempt to help you write a better submission, one that is noticed and understood by your intended audience and not lost in the crowd of interesting proposals we receive. Please use this template as a guideline, while ensuring that it is in your own unique and authentic voice.

BEFORE you begin writing your submission, please give some thought to the following:

Who is the audience for your session? Think about their interests, work roles, challenges, age or experience as you decide this.
What problem/pain are you trying to solve (for the audience)? This should be something that is communicated clearly so that they have a sense of your session’s importance.
What will be the scope of your session? This will help identify the central topic or theme and should describe the broad areas you plan to cover during the session.
How will participants benefit from your session? Think of practical and specific ways in which they will be able to apply the knowledge they gain, beyond just general awareness.
What is the appropriate format for your session, given the audience and objectives that you have in mind?

The most successful talks and sessions are those where presenters are able to abstract an actionable insight from a common pain area, enlighten the audience about something new, provide a fresh perspective, and/or demonstrate innovation.

Here’s a guide for speakers to draft their presentations.

You can view talks held at previous editions of The Fifth Elephant for reference:

Winter edition 2023 - http://has.gy/z_vu
Monsoon edition 2023 - http://has.gy/1ZAa
MLOps conference 2021 - http://has.gy/uefR

The call for submissions will close on 3 June 2024. Talks will be selected on a rolling basis as submissions are made.

Topics for submitting talks

Data engineering - data pipelines and dataset creation for AI; LLM ops; managing NLP pipelines.
Best practices for LLM training, inference, deployment; LLM and security - best practices on security while incorporating LLMs and SLMs in organizations; working with open source LLMs; security, bias and risk mitigation.
GenAI - Generative AI based use-cases, products, platforms and research such as multi-lingual models, use-cases where GenAI is being used.

Types of submissions

You can submit a session for:

30 mins talk
15 mins talk
Demo and startup showcases
Birds of Feather (BOF) sessions
Hands-on workshops of three hours' duration

GraphRAG: Powering Up LLMs with Knowledge Graphs

In the era of big data, large language models (LLMs) are becoming increasingly important for tasks like question answering, document analysis, and chatbot development. However, traditional LLMs can often struggle with factual accuracy, reasoning, and handling complex information.

4 comments
Confirmed & scheduled
13 Apr 2024

Session type: 30 mins talk

Scaling Customer Delight at Zomato using AI

Introduction: In today’s rapidly evolving digital landscape, Generative AI is playing a pivotal role in transforming how businesses interact with their customers.

3 comments
Confirmed & scheduled
15 May 2024

Session type: 30 mins talk

Chat with Tables: Query tabular data in English using self-hosted Large Language Models

Business users and non-technical professionals often need to quickly analyse or transform tabular data in spreadsheets for ad hoc business intelligence. However, they might lack the necessary programming knowledge to do so themselves and therefore must reach out to a data analyst. Such unexpected delays have the potential to incur huge opportunity costs for time-sensitive business decisions which…

0 comments
Confirmed
15 May 2024

Session type: Workshop

A privacy preserving DPI to unfreeze data markets - to solve our data woes!

Clearly, there is a race to build larger and larger AI models these days trained on as much training data as possible. Indian developers are also trying to make their presence felt in this race to build home-grown models for our unique problems and situations. Owing to our large population, we probably generate more data than any other country. This sounds great, right, but much of this Indian co…

2 comments
Confirmed & scheduled
23 May 2024

Session type: 30 mins talk

AI--By the People For the People

Introduction: Just as the steam engine, electricity and the Internet became integral to the first, second and third industrial revolutions, AI is going to be adopted sooner rather than later in all production and business processes. It is very rapidly going to change the way people conduct their businesses and how production processes are executed. While the free/open and the proprietary/closed nature of the so…

1 comment
Confirmed & scheduled
23 May 2024

Session type: 30 mins talk

Llama.lisp: design of an AI first compiler framework

Abstract: Compilers are the workhorses of performance behind all AI algorithms. Making algorithms work effectively on GPUs, known as kernel programming, is especially hard. The compiler ecosystem around GPUs is especially messy. Compilers are supposed to allow for performance portability across different hardware, but this is usually not the case. See the infographic below for the current state of AI compilers.

0 comments
Confirmed & scheduled
29 May 2024

Session type: 30 mins talk

Fine-Tuning LLMs for Script-Writing: A Journey into the world of open source LLMs

Explore the emerging technology of open-source Large Language Models (LLMs) with a hands-on tutorial where we will fine-tune an LLM to build a script-writing assistant for a popular daytime soap. This session delves into the fundamentals of LLMs, including pre-training and fine-tuning with LoRA (Low-Rank Adaptation) or Direct Preference Optimization (DPO), offering a good understanding of these n…

0 comments
Confirmed
25 May 2024

Session type: Workshop

A new approach to building high-performance lakehouse compute engines for open table formats like Delta Lake, Apache Iceberg, and Apache Hudi

Introduction: Platform engineering and data architecture teams are increasingly adopting object-store backed data lakehouses as their central, unified platform for workloads across Analytics as well as AI.

1 comment
Confirmed & scheduled
15 May 2024

Session type: 30 mins talk

Build a Data Product - A Roundtable Discussion

Build a Data Product - A Roundtable Discussion Outline

0 comments
Confirmed & scheduled
31 May 2024

Session type: Workshop

Samvaadini - a telecalling voice bot for blue-collar hiring in India

In this talk, we introduce ‘Samvaadini’, a telecalling voice bot designed specifically for hiring blue-collar workers in India. Samvaadini leverages state-of-the-art AI to capture human responses adeptly, autonomously determine the next action, and respond in a voice indistinguishable from a human.

3 comments
Submitted
06 May 2024

Session type: Demo - showcase of your work

AI-GOLD: Identifying Billion-Dollar AI Use Cases

It is a well-known fact that most (Gen)AI projects fail to deliver any return on investment (ROI) [1]. The reasons behind this are multifaceted. One fundamental reason is the pursuit of suboptimal, and at times entirely inappropriate, use cases.

1 comment
Submitted
05 May 2024

Session type: 30 mins talk

Product Management for AI-first products

In-depth and exclusive content to help seasoned product managers transition into an AI-first world, with plenty of case studies and examples.

2 comments
Submitted
05 May 2024

Session type: Workshop

AI Product management: paradigm shift or old wine in new bottle?

In this talk, we discuss why traditional software product management skills fall drastically short when building (Gen)AI products. Why Product Management for (Gen)AI Products Requires a Major Upgrade

1 comment
Submitted
06 May 2024

Session type: 30 mins talk

Designing CoPilot: An AI-Driven Approach to Conversational Email Marketing Campaigns

In today’s digital marketing landscape, the demand for personalized and effective email campaigns is ever-growing. Our talk dives into this challenge, presenting an innovative solution through the creation of CoPilot, an AI-driven system tailored for Freshmarketer.

3 comments
Awaiting details
13 May 2024

Session type: 30 mins talk

Revolutionizing D2C Marketing: Empowering with Product Recommendation Framework

Freshmarketer empowers D2C store owners with data-driven marketing solutions tailored to their unique needs. Our approach integrates product recommendation systems into marketing campaigns, addressing various marketing objectives. Unlike one-size-fits-all solutions, Freshmarketer develops decision engines customized to individual stores, ensuring flexibility across different categories and campai…

7 comments
Awaiting details
13 May 2024

Session type: 30 mins talk

Unlock Data with NL2SQL: Building Low-code Data Assistant for Business using Code LLMs

This BoF session talks about building a low-code data assistant for business using code LLMs. Generative AI (LLMs) for code has become a very popular and powerful tool for developers, with the rise of enterprise solutions like GitHub Copilot, Amazon CodeWhisperer, Google Duet etc., along with numerous open source code-assist models for generating code in hundreds of programming languages includ…

2 comments
Confirmed & scheduled
15 May 2024

Session type: Birds of Feather (BOF) session

Leveraging the Power of Log Clustering Algorithms to Reduce Alert Noise in IT Operations

The ever-increasing volume of alerts generated by monitoring tools poses a significant challenge for IT Operations teams. A substantial portion of these alerts are duplicates or false positives, overwhelming ITOps practitioners and hindering the timely identification of critical issues. Traditional methods for managing alert floods, such as manual filtering, are ineffective, prone to human error,…

7 comments
Submitted
22 May 2024

Session type: 30 mins talk

The Multimodal Revolution: Reshaping Video Analysis Pipelines

Multimodal AI is revolutionizing video analysis, but practical insights on pipeline design are scarce. Traditional computer vision pipelines often involve a complex web of specialized models. This leads to high costs, maintenance burdens, and difficulty in adapting to new tasks. This talk will dissect a real-world case study where multimodal models dramatically simplified a large-scale video anal…

0 comments
Submitted
23 May 2024

Session type: 30 mins talk

Enterprise-Ready Data Lifecycle: Powering AI & Analytics at scale

In this session, we discuss Atlassian's data architecture to help demystify the complexities of building a real-world, scalable Delta Lakehouse that meets data governance and compliance requirements, and how we enabled various teams to iterate fast on their data-driven initiatives.

6 comments
Confirmed & scheduled
27 May 2024

Session type: Birds of Feather (BOF) session

Advancing TB Screening: Integrating Vision Language Models and Patient Metadata

Problem: TB claims over 1.3 million lives annually, with around 30% of cases missed by current screenings and diagnostics. The shortage of radiologists further complicates timely and accurate TB screenings, often relying on subjective interpretations that can lead to missed diagnoses or unnecessary treatments, impacting patients’ health. There is a critical need for accurate detection and differen…

4 comments
Submitted
30 May 2024

Session type: 30 mins talk

Vector databases Birds of Feather (BOF) session

Background: With recent technological advancements in LLMs, embedding generation and Retrieval Augmented Generation (RAG), there is immense interest in using these technologies to solve problems involving semantic search, chatbots, code graphs, knowledge graphs etc.

6 comments
Confirmed & scheduled
31 May 2024

Session type: 30 mins talk

Unified Help in Jira Service Management using AI

Introduction: Atlassian’s Jira Service Management (JSM) has consistently strived to empower customers by delivering top-notch assistance to those in need. A primary objective has been to promote self-service within JSM, allowing users to promptly access help while reducing the workload on agents.

1 comment
Confirmed & scheduled
31 May 2024

Session type: 30 mins talk

Nested Evolution and Schema Transformation (NEST) Framework for Managing Schema Evolution in Spark

Overview: The NEST Framework automates the handling of dynamic and nested schemas, making it easier for developers to manage schema changes and maintain accurate, deduplicated tables in Spark. We are excited to present this innovative solution at the Data Engineering Conference.

2 comments
Submitted
31 May 2024

Session type: 30 mins talk

Need for new licenses in this age of Generative AI

Table of Contents: Introduction; The disruptive nature of AI technology

1 comment
Confirmed & scheduled
01 Jun 2024

Session type: Birds of Feather (BOF) session

Unifying Senses: The Evolution, Technology, and Impact of Multimodal Fusion

Multimodal fusion has revolutionized the way we integrate and interpret diverse data sources, creating powerful insights from the synergy of visual, auditory, and textual information. In this talk, titled “Unifying Senses: The Evolution, Technology, and Impact of Multimodal Fusion,” we will explore the origins of multimodal fusion and trace its development over the years. We’ll delve into how thi…

1 comment
Confirmed & scheduled
02 Jun 2024

Session type: 30 mins talk

Intent Prediction in Search at Myntra

Myntra is one of India’s leading fashion e-commerce companies, delivering a best-in-class shopping experience through advanced machine learning models. This session will delve into a key machine learning solution designed to enhance query understanding for product search flow. Our ML model accurately interprets user intent from all types of search queries, helping shoppers find exactly what they …

3 comments
Submitted
03 Jun 2024

Session type: 30 mins talk

Vector Databases: A Bird's Eye View

This talk is focused on equipping the audience with an overall understanding of the current vector database landscape, and how vector databases work internally, with a focus on a few common algorithms.

2 comments
Confirmed & scheduled
03 Jun 2024

Session type: 30 mins talk

Triton, the hard way!

Abstract: A lot of engineers are interested in using LLMs nowadays. However, their efficient execution remains a challenge. Efficient execution is key to mainstream adoption. To run them efficiently, we need accelerated systems such as GPUs. This talk will explore the fundamentals of GPU architecture and its programming model, moving beyond model.to('cuda') to understand the inner workings of GPUs. A…

0 comments
Confirmed
03 Jun 2024

Session type: Workshop

Ephemeral data pipelines using Atlassian’s Lithium platform

There are numerous use cases that require moving large amounts of data between different systems and validating and transforming them in-flight. Platforms such as Apache Flink can be excellent choices for moving and transforming data at scale - effectively through streaming ETL. However, certain use cases within Atlassian ‘onprem to cloud data migration’, ‘cloud to cloud data migration’, ‘backup …

1 comment
Confirmed & scheduled
03 Jun 2024

Session type: 30 mins talk

Democratizing AI: Harnessing Decentralized GPUs for AI Model Fine-Tuning and Deployment

Abstract: This session delves into the complexities involved in building a scalable, decentralized GPU cloud tailored for the efficient training, fine-tuning and deployment of AI models, more specifically large language models (LLMs). We will explore the significant technical hurdles our team overcame, including ensuring cost-effectiveness, optionality and accessibility of GPU resources. This inf…

1 comment
Submitted
03 Jun 2024

Session type: 30 mins talk

Apache XTable (Incubating): Interoperability across table formats

Apache Hudi, Delta Lake, and Iceberg are leading open-source projects that offer decoupled storage with transactional and metadata layers, known as table formats in cloud storage. These formats store data in open columnar formats like Parquet and include metadata for schema, commit history, partitions, and column statistics. Selecting a table format can be challenging due to the unique features o…

1 comment
Confirmed & scheduled
03 Jun 2024

Session type: 30 mins talk

Jira cloud data extraction @ scale

Cloud data extraction is a subset of the broader data engineering field that involves the process of retrieving or pulling data from cloud-based applications and services for analysis, reporting, or storage in a centralized data repository. Atlassian’s data extraction solution has evolved significantly over the years to meet the demands of enterprise-grade customers. Initially started with full t…

9 comments
Submitted
03 Jun 2024

Session type: 30 mins talk

Improving search relevance in hyperlocal food delivery using (small) language models

Introduction: The ability to accurately understand and serve customer search queries is critical to Swiggy. This need is amplified in food delivery platforms operating in India due to the wide variety of languages, cuisines and tastes. Our platform alone offers millions of items from hundreds of thousands of restaurants across India. Not only do Indian dish names have a tremendous amount of region…

2 comments
Submitted
03 Jun 2024

Session type: 30 mins talk

Deviations from the norm - anomaly detection with PerceptInsight

Abstract: In a metric-driven digital world, observability is important, but being able to understand anomalies in data streams and correlate them adds significant advantages for organisations. In this talk, I discuss how we went about building this with PerceptInsight, processing over 500 million events/day, and how different organisations are leveraging it to their benefit.

3 comments
Submitted
05 Jun 2024

Session type: 30 mins talk

Design Patterns for Data Masking and Tokenization

Outline: In the era of big data, ensuring the privacy and security of sensitive information is more crucial than ever.

0 comments
Submitted
05 Jun 2024

Session type: 30 mins talk

AI and Risk Mitigation Strategies in Key Indian Sectors

Abstract: As AI continues to revolutionize various sectors, it brings both unprecedented opportunities and significant risks. In India, sectors such as Agritech, Fintech, Edtech, public services, and Healthtech are rapidly adopting AI technologies. However, the lack of robust risk mitigation strategies can lead to unintended consequences, including data breaches, algorithmic biases, and systemic …

0 comments
Confirmed & scheduled
07 Jun 2024

Session type: Birds of Feather (BOF) session

Book Discussion on Dream Machine: A Graphic Novel about AI

Join us for a candid chat with the artist Appupen, whose recently released graphic novel Dream Machine explores the implications of unleashing AI on the real world. Through a narrative centered around Hugo, an entrepreneur who dreams of being a superhero, the novel uncovers some crucial concerns around implementing AI at scale, such as Bias, Surveillance, Ethics, Trust & Creativity.

0 comments
Confirmed & scheduled
08 Jun 2024

Session type: Birds of Feather (BOF) session

Securing big data environments

This BoF is about securing big data environments and learning about different controls from the security, data privacy, and compliance side. It covers how to balance security, scale, and user experience while scaling a big data environment. There will be discussion around certain use cases and edge cases that data platform teams should be aware of while implementing security controls.

0 comments
Confirmed & scheduled
10 Jun 2024

Session type: Birds of Feather (BOF) session

Solving the Data Platform Puzzle: Observability Meets Cost Optimization

Outline: This session is aimed at data platform engineers, data architects, and engineering leaders who are looking to significantly reduce costs while maintaining or improving platform performance and reliability. The content will be tailored to those with a strong technical background who are facing challenges around optimizing complex data pipelines and infrastructure.

4 comments
Submitted
10 Jun 2024

Session type: 30 mins talk

Ensuring Data Quality with Data Contracts and OpenLineage

Abstract: In the modern data landscape, ensuring data quality and integrity is paramount. This conference will explore the concept of Data Contracts as a schema registry, incorporating data quality (DQ) checks and leveraging OpenLineage to capture compliance failures. By implementing Data Contracts, organizations can enforce strict data quality standards and track lineage to understand the impact …

1 comment
Confirmed
10 Jun 2024

Session type: Workshop

RAG Vs Fine-Tuning: Implementation Anecdotes from Data Catalog Enrichment Solution

Abstract: This talk will take the audience through our experience of building a content generation solution for a data catalog enrichment effort, from a modeling perspective (RAG-based pre-trained model and RAG-based fine-tuned model).

0 comments
Submitted
12 Jun 2024

Session type: 30 mins talk

Digital Twin for Retail Shelf Optimization using Advanced ML and GenAI at AB InBev (world’s largest beer company)

Outline: AB InBev sells a significant share of its beer volume through retailers who take AB InBev’s assistance to configure the best shelf assortment that would maximize their revenues. Planogramming is a very important step that retailers perform to allocate their available shelf space to the right products. For large retailers that have thousands of stores, this becomes a very time-consuming and ted…

2 comments
Submitted
13 Jun 2024

Session type: 30 mins talk

Building an AI Data Analyst

LLMs have transformed data analysis. With their ability to generate code to analyse data given appropriate prompts and instructions, LLMs are forming the bedrock of a new suite of data analysis tools.

1 comment
Confirmed & scheduled
13 Jun 2024

Session type: 30 mins talk

Unlocking the power of Real Time Feature Stores

In today’s data-centric world, businesses rely on personalization now more than ever. Whether it’s personalizing user experiences, optimizing operations, or predicting market trends, data plays a pivotal role. To harness the full potential of data, organizations are turning to real-time feature stores. In this talk, we’ll explore what real-time feature stores are, why they matter, and how we at Z…

0 comments
Confirmed & scheduled
15 Jun 2024

Session type: 15 mins talk

Practical tips for building AI applications using LLMs - Best practices and trade-offs

Overview: At KushoAI, we’ve built an AI agent that can autonomously perform API testing for you. While building this, we came across a lot of problems specific to AI applications built on top of LLMs that you don’t see anywhere else. Since this is a fairly new area of development, we had to spend a lot of time figuring out solutions for them on our own.

2 comments
Submitted
17 Jun 2024

Session type: 30 mins talk

Establishing Causality using AI in Mental Health

This talk explores the forefront of artificial intelligence (AI) in establishing causality in mental health. By leveraging Graph Neural Networks (GNNs) and Spatio-Temporal Graph Neural Networks (STGNNs), we aim to uncover causal relationships in complex mental health causal effects. The session will cover fundamental concepts of causality, the transition from traditional GNNs to STGNNs, and the c…

1 comment
Confirmed & scheduled
17 Jun 2024

Session type: 30 mins talk

Content Moderation Systems at Scale

We heavily rely on the Web for meeting our information needs today. Examples include Wikipedia, Twitter, Instagram, YouTube, Google Maps etc. All of these are platforms where millions of users post billions of pieces of content every day on a wide range of topics. The content is consumed by hundreds of millions of users. While a rich source of information, these platforms are also easy targets fo…

0 comments
Confirmed & scheduled
19 Jun 2024

Session type: 30 mins talk

LLM's Anywhere: Browser Deployment with Wasm & WebGPU

Description: In today’s interconnected world, deploying and accessing machine learning (ML) models efficiently poses major challenges. Traditional methods rely on cloud GPU clusters and constant internet connectivity. However, WebAssembly (Wasm) and WebGPU technologies are revolution…

0 comments
Submitted
21 Jun 2024

Session type: 30 mins talk

Getting dimensions right! A sneak peek at entity resolution in the warehouse and datalake

Real world data contains multiple records belonging to the same customer. These records can be in single or multiple systems and they have variations across fields, which makes it hard to combine them together, especially with growing data volumes. This hurts customer analytics - establishing lifetime value, loyalty programs, or marketing channels is impossible when the base data is not linked. N…

1 comment
Submitted
22 Jun 2024

Session type: 30 mins talk

From Foundation to the Future: The Evolution of Dream11's Data Platform

Outline: Introduction (5 minutes): Brief overview of Dream11

0 comments
Confirmed & scheduled
26 Jun 2024

Session type: 30 mins talk

Building and Deploying LLM Applications: From Concept to Production - AMA with Mixture-of-Experts

Session Overview: The AMA with Mixture-of-Experts on Building and Deploying LLM Applications: From Concept to Production was held on 24th July at BIC, as a part of The Fifth Elephant 2024 Annual Conference.

1 comment
Confirmed
04 Jul 2024

Session type: Birds of Feather (BOF) session

Imagining the Future of AI in India

The rush to build AI is taking over every business organization as it adapts to a new era of automation. The push for AI is going to take over the economy and society at large, with applications of AI in every sector. This raises many important questions about the production of AI, from large-scale data centers to the data sets required to train these complex mathematical models. In this context, how do we i…

0 comments
Confirmed & scheduled
06 Jul 2024

Session type: Birds of Feather (BOF) session

Hosted by

The Fifth Elephant

Jumpstart better data engineering and AI futures

Supported by

Gold Sponsor

Atlassian

Atlassian unleashes the potential of every team. Our agile & DevOps, IT service management and work management software helps teams organize, discuss, and compl

Silver Sponsor

Google

Together, we can build for everyone.

Workshop sponsor

Datastax

Datastax, the real-time AI Company.

Lanyard Sponsor

Uber

We reimagine the way the world moves for the better.

Sponsor

Monster API

MonsterAPI is an easy and cost-effective GenAI computing platform designed for developers to quickly fine-tune, evaluate and deploy LLMs for businesses.

Community Partner

FOSS United Foundation

FOSS United is a non-profit foundation that aims at promoting and strengthening the Free and Open Source Software (FOSS) ecosystem in India.

Beverage Partner

BONOMI

BONOMI is a ready-to-drink beverage brand based out of Bangalore. Our first foray into the beverage category is ready-to-drink cold brew coffee.