Submissions

Accepting submissions till 31 Dec 2020, 11:59 PM

Not accepting submissions

If you missed the deadline for submitting your talk for The Fifth Elephant 2019 -- to be held in Bangalore on 25 and 26 July -- you can propose a talk here.

We are accepting talks on:

Data engineering -- engineering and architecture approaches; problems that teams were attempting to solve (and therefore the solutions that they built).
ML engineering -- engineering and architecture approaches; problems that teams were attempting to solve (and therefore the solutions that they built).
Data science -- and its applications in diverse domains.
Open source algorithms
Data privacy and its solutions in technology; engineering implementations of HIPAA compliance, GDPR and other data protection frameworks.
Data security -- standards, approaches to solving data security, challenges and problems to solve for data security at scale.
Business intelligence -- how non-technical teams are accessing data in companies to mine intelligence; approaches to BI; real-life case studies and applications of BI; what counts as business intelligence for businesses.
Decision science.

Building a large-scale Data as a Service (DaaS) platform to consistently deliver high-quality datasets

As a provider of Competitive Intelligence as a Service to eCommerce businesses and consumer brands, DataWeave aggregates and analyses product catalog data from eCommerce websites each day at massive scale. Once aggregated, this data is fed into a complex process of extraction, transformation, machine learning, and analyses. These operations are performed on a consistent basis to provide our custo…

4 comments
Rejected
01 May 2019

Session type: Short talk of 20 mins

Finding needles in high dimensional haystacks: Product Matching in Retail

Matching the same and similar products is a problem fundamental to the online retail industry, with multiple applications spanning price optimization, recommending similar or substitute products to customers, understanding gaps in product assortments, and counterfeit product detection. Given that there are no standard product identifiers, catalog data is often noisy, incomplete and non…

10 comments
Rejected
01 May 2019

Session type: Short talk of 20 mins

Websites to Datasets

11 comments
Rejected
09 May 2019

Session type: Short talk of 20 mins

A Journey of Building Dream11's Data Platform

Dream11 is India’s biggest fantasy sports platform that allows users to play fantasy cricket, hockey, football, kabaddi and basketball. Our total user base is over 70 million and is expected to cross 100 million by the end of 2019.

2 comments
Rejected
10 May 2019

Deep Learning powered Genomic Research

Disease occurs when there is a slip in the finely orchestrated dance between physiology, environment and genes. Treatment with chemicals (natural, synthetic or a combination) solved some diseases, but others persisted and were propagated across generations. The molecular basis of disease became the prime focus of studies to understand and analyze root causes. Cancer also showed a way that or…

0 comments
Submitted
21 May 2019

Session type: Workshop

Panel Discussion around Healthcare Analytics

0 comments
Submitted
21 May 2019

Session type: Birds of a Feather session of 1 hour

Interpretable NLP Models

Deep learning models are widely regarded as black boxes that lack interpretability compared to traditional machine learning models. So there is always hesitation in adopting deep learning models in user-facing applications (especially medical applications). Recent progress in NLP, with the advent of attention-based models, LIME and other techniques, has helped to address this. I would like to walk…

3 comments
Rejected
31 May 2019

Session type: Tutorial

Real-Time DataQuality on Flink

My use case is to monitor and improve overall search data quality, to find unusual patterns in users’ search behavior, and to notify the respective business stakeholders of on-site intent. To achieve this, I explored various big data processing engines that can process large volumes of data with complex business logic in real time. Eventually, I used Flink Stream…

4 comments
Rejected
17 Jun 2019

Session type: Full talk of 40 mins

Building a Location Intelligence Platform for audience segmentation

The ROI of OOH (Out of Home Advertisement) depends on precise and intelligent targeting of advertisements. The media buyers therefore require detailed understanding and visibility of the audiences across various attributes so that they can then plan their OOH media buy to specifically target a selected set of audiences. Location information of the user, device level audience data, enriched with r…

4 comments
Rejected
18 Jun 2019

Session type: Short talk of 20 mins

How to make a kickass data platform with spark and S3

In this talk, we will explore the advantages and challenges faced while running an in-house data platform using Spark and S3. We will also discuss how to add some essential features to your platform like autoscaling and access control. The latter part of the talk will also address some ways to organise data in S3, storage formats for big data and indexing to improve read performance for big-data …

5 comments
Under evaluation
01 Jul 2019

Session type: Full talk of 40 mins

Anomaly Detection at Scale: Architectural Choices for Data Pipelines for 7B events per day

Cloud-native applications. Multiple Cloud providers. Hybrid Cloud. 1000s of VMs and containers. Complex network policies. Millions of connections and requests in any given time window. This is the typical situation faced by a Security Operations Control (SOC) Analyst every single day. In this talk, the speaker talks about the highly available and highly scalable data pipelines that he built for …

0 comments
Submitted
02 Jul 2019

Session type: Full talk of 40 mins

Deploying Deep Learning models on the Edge (Android, IOS, ...)

Training task-specific deep learning models is very easy these days, given the wide range of available libraries and documentation. The difficulty lies in making them production-ready, especially if the application targets mobile platforms. Though there are existing wrappers for certain libraries to make them work, as of now they are slow and use…

0 comments
Submitted
05 Jul 2019

Session type: Full talk of 40 mins

Machine Learning Model Management with MLflow

Background: Data is the new oil, and its size is growing exponentially day by day. Most companies are leveraging data science capabilities extensively to affect business decisions, perform audits on ML patterns, decode faults in business logic, and more. They run a large number of machine learning models to produce results.

0 comments
Submitted
15 Jul 2019

Session type: Tutorial

Building a data pipeline inside and outside a vehicle

Ather 450 is a smart electric vehicle with data-intensive features on the vehicle as well as on the cloud/mobile app. On the vehicle, the on-board software uses vehicle data to make decisions regarding vehicle behaviour and safety, while providing user-delight features like auto-indicator. Via the cloud, the user has a mobile app through which the vehicle can be monitored and their ride stati…

1 comment
Submitted
09 Jul 2019

Session type: Short talk of 20 mins

Data Science for the discretionary managers: Lessons from a 60 trillion$ traditional industry resistant to change and facing the quant threat

Investment management is a $60 trillion industry and, despite recent advancements in data science and machine learning, still remains fairly discretionary. Until recently, less than 20% of funds called themselves quantitative. However, there is a massive transformation taking place right now within the discretionary investment management industry. Quantitative and systematic strat…

0 comments
Submitted
19 Jul 2019

Session type: Short talk of 20 mins

Case study: Outbound logistics optimization for multi depot problem with time window

0 comments
Submitted
25 Jul 2019

Session type: Short talk of 20 mins

Hosted by

The Fifth Elephant

Jumpstart better data engineering and AI futures