Submissions

Accepting submissions till 15 Jun 2019, 01:00 PM

NIMHANS Convention Centre, Bengaluru

##The eighth edition of The Fifth Elephant will be held in Bangalore on 25 and 26 July. A thousand data scientists, ML engineers, data engineers and analysts will gather at the NIMHANS Convention Centre to discuss:

  1. Model management, including data cleaning, instrumentation and productionizing data science.
  2. Bad data and case studies of failure in building data products.
  3. Identifying and handling fraud, and data security at scale.
  4. Applications of data science in agriculture, media and marketing, supply chain, geo-location, SaaS and e-commerce.
  5. Feature engineering and ML platforms.
  6. What it takes to create data-driven cultures in organizations of different scales.

##Highlights:

1. Meet Peter Wang, co-founder of Anaconda Inc, and learn about why data privacy is the first step towards robust data management; the journey of building Anaconda; and Anaconda in enterprise.
2. Talk to the Fulfillment and Supply Group (FSG) team from Flipkart, and learn about their work with platform engineering where ground truths are the source of data.
3. Attend tutorials on Deep Learning with RedisAI and on TransmogrifAI, Salesforce’s open source AutoML.
4. Discuss interesting problems to solve with data science in agriculture, SaaS perspective on multi-tenancy in Machine Learning (with the Freshworks team), bias in intent classification and recommendations.
5. Meet data science, data engineering and product teams from sponsoring companies to understand how they are handling data and leveraging intelligence from data to solve interesting problems.

##Why you should attend?

  1. Network with peers and practitioners from the data ecosystem
  2. Share approaches to solving expensive problems such as cleanliness of training data, model management and versioning data
  3. Demo your ideas in the demo session
  4. Join Birds of Feather (BOF) sessions to have productive discussions on focussed topics. Or, start your own Birds of Feather (BOF) session.

##Full schedule published here: https://hasgeek.com/fifthelephant/2019/schedule

##Contact details:
For more information about The Fifth Elephant, sponsorships, or anything else, call +91-7676332020 or email info@hasgeek.com.

#Sponsors:

Sponsorship Deck.
Email sales@hasgeek.com for bulk ticket purchases, and sponsoring 2019 edition of JSFoo:VueDay.

JSFoo:VueDay 2019 sponsors:

#Platinum Sponsor

Anatta

#Community Sponsors

Salesforce Ericsson freshworks
databricks

#Exhibition Sponsors

Sapient Atlassian GO-JEK
Bayer

#Bronze Sponsor

Sumologic Walmart Labs Atlan
Simpl Great Learning

#Community Sponsors

Elastic Anaconda Aruba Networks

Hosted by

The Fifth Elephant - known as one of the best data science and Machine Learning conferences in Asia - has transitioned into a year-round forum for conversations about data and ML engineering; data science in production; data security and privacy practices. more

Not accepting submissions

Manoj Kumar

TuneIn: How to get your jobs tuned while sleeping

Have you ever tuned a Spark, Hive or Pig job? If yes, then you must know that it is a never-ending cycle of executing the job, observing the running job, making sense of hundreds of Spark/Hadoop metrics and then re-running it with better parameters. Imagine doing this for tens of thousands of jobs. Manually doing performance optimization at this scale is tedious, requires significant expertis… more
  • 15 comments
  • Rejected
  • 19 Sep 2018
Session type: Full talk of 40 mins

Jaskaran Singh

Harnessing implementation Patterns in Data Science

Transforming data science and big data implementations into generic, reusable blueprints for generating data pipelines that save developers cost and time, accompanied by a generic CI/CD (continuous integration and continuous deployment) pipeline for deploying them to any cloud in minutes. more
  • 3 comments
  • Rejected
  • 23 Aug 2018

Andrew Murphy

Human Centered Leadership - Emotional Intelligence for the Technical Mind

There’s a huge problem in our industry, I call it “inertia-driven leadership”. more
  • 2 comments
  • Rejected
  • 06 Jan 2019

Andrew Murphy

Communicating anything to anyone. How to communicate effectively and efficiently

Everyone thinks they are good at communication, but... how many times have you been at an event talking to someone you really didn’t want to talk to? Been sold to by someone who didn’t get that you weren’t interested? more
  • 1 comment
  • Rejected
  • 06 Jan 2019
Session type: Workshop

Andrew Murphy

The power of saying "I don't know"

It’s something we all struggle with, admitting we don’t know something. But I’m here to show you the power of saying “I don’t know” to people. more
  • 2 comments
  • Rejected
  • 06 Jan 2019
Session type: Short talk of 20 mins

Maulik Soneji

Using ML for Personalizing Food Search at Go-jek

GoFood, the food delivery product of Gojek, is one of the largest of its kind in the world. This talk summarizes the approaches considered and lessons learnt during the design and successful experimentation of a search system that uses ML to personalize the restaurant results based on the user’s food and taste preferences. more
  • 14 comments
  • Rejected
  • 13 Jan 2019
Session type: Full talk of 40 mins

krupal Modi

Machine Learning in Production : Fundamentals and Updates

< Work in Progress > When both technology and ecosystem are rapidly evolving, one of the prerequisites to excel is to focus on building things that either last longer or truly differentiate themselves from currently available alternatives. If you are a Machine Learning practitioner, it’s not hard to end up in a situation where several research papers and prototypes of new algorithms are out o… more
  • 1 comment
  • Rejected
  • 12 Jan 2019

Aiko Klostermann

The Deep Learning Showdown: How to pick the right tool for the job?

When you have a data-centric problem to solve and you look for a technology to support you with this, the machine intelligence landscape can be overwhelming. I analysed the landscape using a data-driven approach and condensed the outcome into a consumable form. Additionally, I came to the conclusion that there is a set of questions you have to ask yourself to make the best possible choice for your… more
  • 1 comment
  • Rejected
  • 01 Feb 2019

Jacob Joseph

Leveraging Power of Analytics for Martech

Marketing Technology has undergone a technological revolution over the past 10 – 15 years. Today marketers are able to track the smallest digital footprints, like scrolls on mobile or web apps. Armed with this digital trove of user behavior data, marketers are trying to nudge and retain their users across the customer lifecycle. more
  • 19 comments
  • Confirmed & scheduled
  • 11 Feb 2019
Session type: Full talk of 40 mins

Nilesh Trivedi

Building a personalized learning system using a concept graph, and latest research in cognitive science

I have been actively learning new things (beyond what was required for my formal education) since I was a teenager. A few things I have learned in this time are: mathematics, engineering, economics, philosophy, public speaking, a dozen or so musical instruments, a dozen or so programming languages. But the list of things I am yet to learn is not getting any shorter. I realized that I had to get … more
  • 6 comments
  • Rejected
  • 25 Feb 2019
Session type: Tutorial

Ravi Ramchandran

Data Quality Management @Walmart Data Lake

Erroneous decisions made from bad data are not only inconvenient, but also extremely costly. According to Gartner research, “the average financial impact of poor data quality on organizations is $9.7 million per year.” In additional research for organizations that Gartner has surveyed, the analyst firm “estimate that poor-quality data is costing them on average $14.2 million annually.” Definitely… more
  • 26 comments
  • Rejected
  • 27 Feb 2019
Session type: Full talk of 40 mins

Pradip Thoke

A Journey of Building Dream11's Data Platform

Dream11 is India’s biggest fantasy sports platform that allows users to play fantasy cricket, hockey, football, kabaddi and basketball. Our total user base is over 50 million and is expected to cross 100 million by the end of 2019. more
  • 2 comments
  • Rejected
  • 27 Feb 2019

Gopal Sakarkar

AI Understand Human

Artificial Intelligence and Machine Learning are the cutting-edge technologies of today’s world. Using speech processing and recognition, we can control various electrical equipment at home. Amazon Alexa is a very handy and useful AI-based product that understands human communication and speech commands and replies with accurate information. This talk will focus deeply on the working of A… more
  • 4 comments
  • Rejected
  • 06 Mar 2019

Vinodh Kumar R

The Art of Applying Data Science

As more and more organizations have begun embracing data science to solve a wide spectrum of problems across their business, there is still a gap between the potential that data science holds and the actual outcomes that organizations see while applying data science. In this talk, I will try to draw on my experience of having worked on and built data science products over the last decade - from build… more
  • 2 comments
  • Rejected
  • 14 Mar 2019

Vikram Vij

Exploring the un-conventional: End to End learning architectures for automatic speech recognition

Speech recognition is a challenging area, where accuracies have risen dramatically with the use of deep learning over the last decade, but there are still many areas of improvement. We start with the basics of speech recognition and the design of a conventional speech recognition system, comprising acoustic modeling, language modeling, lexicon (pronunciation model) and decoder. To improve the … more
  • 7 comments
  • Rejected
  • 17 Mar 2019
Session type: Lecture, Full talk of 40 mins

Yuvaraj Loganathan

Why we went ahead with Apache Pulsar (streaming platform 2.0) instead of Apache Kafka

In this talk we will discuss different asynchronous communication patterns, especially queuing and pub/sub streaming platforms. We talk about Kafka and its use cases. We will move on to the architectural limitations of Kafka and then discuss Apache Pulsar and how it overcomes the limitations of Kafka. Finally we will take you through the bells and whistles of Apache P… more
  • 2 comments
  • Rejected
  • 26 Mar 2019

Ravishankar Suribabu

Managing Infrastructure for Machine Learning Platform at Walmart scale - Using Kubernetes as the backbone

One of the most critical challenges in bringing Machine Learning to practice is to avoid the various technical debt traps which the data science teams focus on in their day to day jobs. Building a Machine Learning Platform at Walmart has a single agenda i.e. to make it easy for data scientists to use the company’s data to train/build new ML models at scale and making the “single click” deployment… more
  • 19 comments
  • Rejected
  • 27 Mar 2019
Session type: Full talk of 40 mins

Shivji Kumar Jha

Metadata Catalogue - Making sense of all your data, whether stream or store, the self serve way

What: This talk presents the case for a central metadata catalogue repository for metadata discovery, cataloguing, and control service. This is another step towards enabling self service from your streams. We did this by forking Apache Atlas, establishing a central metadata repository to capture metadata across datasets and surface it through a single platform to simplify data discovery and trace i… more
  • 2 comments
  • Rejected
  • 30 Mar 2019

Shivji Kumar Jha

Schema Registry and the nitty gritty details of schema formats

The data ecosystem has come a long way in the last decade. The ride from structured to unstructured data has been quick. And Kafka (more generally the streaming ecosystem) has been at the forefront of that innovation. While the streaming architecture started with bits (== data - semantics) flowing through the network to offer flexibility, the structure and semantics have caught up rather quickly. The same … more
  • 2 comments
  • Rejected
  • 30 Mar 2019

Venkata Pingali

Anatomy of a production ML feature engineering platform

This talk addresses the following questions: What should a production ML feature engineering platform have and why? more
  • 8 comments
  • Confirmed & scheduled
  • 01 Apr 2019
Session type: Full talk of 40 mins

Neha Kumari

Improving product discovery via Hierarchical Recommendations!

A recommendation engine’s primary goal is to surface personalised & relevant content to the user, content which satisfies explicit intent as well as serendipitous content that would otherwise be invisible. E-commerce categories such as Lifestyle have a lot of flux: the trends last for a short time window and have their demand distributed across an extensive selection. In such cases, recommending… more
  • 12 comments
  • Confirmed & scheduled
  • 05 Apr 2019
Session type: Lecture, Full talk of 40 mins

Sachin Parmar

Running ML Workflows using Airflow @ Walmart

One of the most critical challenges in bringing Machine Learning to practice is to avoid the various technical debt traps which the data science teams focus on in their day to day jobs. Building a Machine Learning Platform at Walmart has a single agenda i.e. to make it easy for data scientists to use the company’s data to train/build new ML models at scale and making the deployment experience sea… more
  • 6 comments
  • Rejected
  • 06 Apr 2019
Session type: Full talk of 40 mins

Regunath Balasubramanian

Anatomy of a Reseller Bot - detecting and protecting customer experience at the scale of an eCommerce Flash sale

Flipkart pioneered online flash sales of Mobile phones in India. Many models eventually went on to become bestsellers, breaking records for most units sold in a matter of seconds. While we were scaling our systems to meet the spikes in user traffic to handle such sales, we were unknowingly also serving non-human bot traffic. These bots were run by resellers to buy the high-demand phones posing as… more
  • 3 comments
  • Rejected
  • 09 Apr 2019

Kumar Puspesh

10 steps to build-your-own data pipeline - for day 1 of your startup

We are a gaming company making mass market social games. Being in a consumer market where user experience is the key, we had to rely heavily on data from Day 1 of game/product launches. This is the reason we actually built our data infrastructure in parallel to games/products and had it ready for production usage from the very beginning. We relied heavily on ready-to-use systems but at the… more
  • 3 comments
  • Confirmed & scheduled
  • 14 Jan 2019
Session type: Full talk of 40 mins

Krishna Sangeeth KS

The last mile problem in ML

“We have built a machine learning model, What next?” more
  • 3 comments
  • Rejected
  • 10 Apr 2019

Ishita Mathur

How GO-FOOD built a Query Semantics Engine to help you find food faster

Context: The Search problem. GOJEK is a SuperApp: 19+ apps within an umbrella app. One of these is GO-FOOD, the first food delivery service in Indonesia and the largest food delivery service in Southeast Asia. There are over 300 thousand restaurants on the platform with a total of over 16 million dishes between them. more
  • 12 comments
  • Confirmed & scheduled
  • 10 Apr 2019
Session type: Full talk of 40 mins

Shrashti Gupta

Spark on Kubernetes

Typical data processing and machine learning workloads include heavy setups like the Hadoop stack, Kafka, NoSQL databases, Application APIs and so on. Traditionally, these workloads run on top of dedicated setups, which adds overhead for IT teams as well as developers in managing multiple clusters. It is the need of the hour to develop a unified solution to manage all the workloads on a single control plane… more
  • 4 comments
  • Rejected
  • 10 Apr 2019
Session type: Full talk of 40 mins

Ravi Ranjan

Machine Learning Model Management with MLflow

Background: Data is the new oil and its size is growing exponentially day by day. Most companies are leveraging data science capabilities extensively to affect business decisions, perform audits on ML patterns, decode faults in business logic, and more. They run a large number of machine learning models to produce results. more
  • 13 comments
  • Rejected
  • 10 Apr 2019
Session type: BOF session of 1 hour

Venkateshan

Solving the vehicle routing problem for optimizing shipment delivery

At each Flipkart Delivery hub, an important task is determining the assignment of shipments to vehicles and the specific routes taken by vehicles to deliver the items to customers. Informally, a good assignment is one that minimizes the total distance while also distributing the shipments evenly across the different vehicles and does not have too many overlapping or criss-crossing routes. We form… more
  • 4 comments
  • Confirmed & scheduled
  • 11 Apr 2019
Session type: Full talk of 40 mins

Ginette V

Optimisation using Julia

While planning their marketing campaigns, our clients had to understand how their marketing spend affects their KPIs. We created models to understand the effect of individual marketing channels such as TV, Radio, Digital etc on KPIs like sales, qualified reach or profits. We had to help them to build optimised brand plans and campaign plans that use the allocated budget effectively. more
  • 6 comments
  • Rejected
  • 12 Apr 2019
Session type: Lecture, Short talk of 20 mins

Saarthak Puri

Alerting @ AppDynamics: Simplifying User Experience for Data Intensive Applications

AppDynamics builds products that help large enterprises monitor their Application environments. A big part of monitoring is to be alerted when something goes wrong. AppDynamics provides tools that help users build these alerts, and over the last ten years, they have been using these tools to build alerts for mission critical applications. more
  • 1 comment
  • Rejected
  • 12 Apr 2019

Soma Dhavala

Building Enterprise grade ML Apps : Tools and Architectures

ML Products are unfinished by design. ML-centric quality attributes such as MSE and F1-score, etc. are necessary but not sufficient. How do we address this fundamentally unsettling characteristic? And the existing Data Science practices are not scalable beyond the confines. In the first part of the talk, an axiomatic framework is provided to address these issues. more
  • 2 comments
  • Rejected
  • 13 Apr 2019
Session type: Full talk of 40 mins

Soma Dhavala

Let's dope it: Interoperable ML via Deep Learning

One of the biggest hurdles to reducing time-to-market of an ML Product is the two language problem. Generally speaking, the tech stacks of the Producers of the ML models and its Consumers are different. Say, a Data Scientist may work with Python, but a Production Engineer may want it in a JVM language. There are multiple approaches to solving this problem. Languages like Julia offer the expressiven… more
  • 5 comments
  • Rejected
  • 13 Apr 2019
Session type: Full talk of 40 mins

Krishna Durai

Kubeflow: ML on Kubernetes

Data science software teams find it tedious to implement ML workflows in a repeatable, maintainable and sustainable manner. Even if such a platform is developed, it has challenges with further inclusion of newer workflows or capabilities, portability across various infrastructure platforms (cloud, on-premise, and hybrid), scalability in terms of compute resources, and managing the number of teams… more
  • 6 comments
  • Rejected
  • 14 Apr 2019
Session type: Full talk of 40 mins

Rajdeep Dua

Tutorial: Meet TransmogrifAI, Open Source AutoML powering Salesforce Einstein

In this talk we will explain how TransmogrifAI, an AutoML library on top of Apache Spark, helps build automated machine learning pipelines with feature engineering and feature selection. It provides automatic model selection along with automated model hyperparameter tuning. more
  • 2 comments
  • Confirmed & scheduled
  • 14 Apr 2019
Session type: Tutorial

Ashish Tadose

The art of abstraction to handle database and storage system chaos

Growing data volumes and varying needs of data storage and access patterns gave rise to the adoption of diverse databases such as key-value, wide-column, document, graph and so on. Also, with increasing adoption of public clouds, organizations started leveraging flexible storage mediums such as HDFS & Object stores. There is a dire need for a query engine in the analytics platform which can query a… more
  • 2 comments
  • Rejected
  • 14 Apr 2019

Namrata Hanspal

Model interpretability

The choice we make: Complex machine learning models work very well at prediction and classification tasks but become really hard to interpret. On the other hand, simpler models are easier to interpret but less accurate, and hence oftentimes we are made to take a call between interpretability and accuracy. more
  • 8 comments
  • Rejected
  • 15 Apr 2019
Session type: Short talk of 20 mins

Avinash Ramakanth

A journey through Cosmos to understand users.

This talk covers the journey of building a cloud native user feedback system for Inmobi DSP. The challenges involved and the need for sharing these learnings can be appreciated by observing that a typical DSP processes anywhere from 250,000 - 1,000,000 queries per second, with an average response time of sub 50 milliseconds. To make intelligent decisions in such high throughput low latency system… more
  • 9 comments
  • Confirmed & scheduled
  • 15 Apr 2019
Session type: Full talk of 40 mins

Devashish Sood

Data-Driven Sourcing of Candidates for Recruitment

We cover how we are using social media data to source candidates, with details on how we managed the data pipeline, trained the models, built the webapp, and handled data security, GDPR and legal requirements. Our project manages a huge amount (~1 TB) of data but is used by a small number of users. more
  • 2 comments
  • Rejected
  • 15 Apr 2019
Session type: Full talk of 40 mins

Joy Mustafi

The Artificial Intelligence Ecosystem driven by Data Science Community

MUST Research is dedicated to promoting excellence and competence in the field of data science, cognitive computing, artificial intelligence, machine learning and advanced analytics for the benefit of society. MUST aims to build an ecosystem to enable interaction between academia and enterprise, help them in resolving problems, as well as make them aware of the latest developments in the cognitive era … more
  • 2 comments
  • Rejected
  • 15 Apr 2019
Session type: Full talk of 40 mins

Pushker Ravindra

Data Science Best Practices for R and Python

How many times did you feel that you were not able to understand someone else’s code or sometimes not even your own? It’s mostly because of bad/no documentation and not following the best practices. Here I will be demonstrating some of the best practices in Data Science, for R and Python, the two most important programming languages in the world for Data Science, which would help in building sust… more
  • 2 comments
  • Rejected
  • 15 Apr 2019
Session type: Workshop

Akash Khandelwal

Maintaining Data Pipelines' Sanity at Scale : How Validations and Metric Visualization came to our rescue!

Have you ever been through a nightmare when corrupt data from an upstream source led to a rogue index push to prod? more
  • 13 comments
  • Rejected
  • 15 Apr 2019
Session type: Lecture, Full talk of 40 mins

abishekk92

Similarity Search for Product Matching @ Semantics3

One of the major offerings of Semantics3 is our universal product data catalog gathered through large scale indexing of the public web. For each catalog, duplicated entries of the same product across multiple retailers need to be merged/removed. In this talk, we will go through the technical challenges in such a large scale “product matching” system, where millions of products are often compared … more
  • 16 comments
  • Confirmed & scheduled
  • 15 Apr 2019
Session type: Lecture, Full talk of 40 mins

Suvrat Hiran

Fuzzy Deduplication of records at scale

The quality of stored data has significant implications for any product/system that relies on information. Unfortunately, data is sometimes entered erroneously into the system, creating duplicate entries. This leads to a decrease in the quality of data retrieval for any product/system. Particularly for Freshworks, we are looking at incorporating deduplication as a feature in our CRM product, Freshsales. Here dedup… more
  • 4 comments
  • Rejected
  • 15 Apr 2019
Session type: Lecture, Short talk of 20 mins

Agam Jain

Building Robust, Reliable Data Pipelines

This talk is about sharing our learnings and some best practices we have built over the years working with massive volumes and ever-changing schemas of data. What we are not going to discuss is the specifics of what technological choices we actually made. Or, how we scaled our system 10x year on year. Or, how we brought down the latency in processing of our data to half. Zapr has profiled millions of … more
  • 11 comments
  • Confirmed & scheduled
  • 15 Apr 2019
Session type: Short talk of 20 mins

Sarthak Dev

Designing a Data Pipeline at Scale

At Freshworks, we deal with petabytes of data every day. For our data science teams to read online data, run ETL jobs and push out relevant predictions in quick time, it’s imperative to run a strong and efficient data pipeline. In this talk, we’ll go through the best practices in designing and architecting such pipelines. more
  • 1 comment
  • Rejected
  • 15 Apr 2019
Session type: Full talk of 40 mins

Paritosh Umesan

Data enabled Journey to elevate Developer Experience

This crisp talk focuses on the challenges (or opportunities) which IT4IT and Engineering Productivity organizations face (or should seize). more
  • 3 comments
  • Rejected
  • 15 Apr 2019
Session type: Short talk of 20 mins

Jagadeesh Rajarajan

A.I Insights for Sales

At Freshworks, we are building Freddy for Freshsales - An intelligent sales assistant. We will talk about the problems we solve using A.I, why we choose these problems and how we solve them. more
  • 1 comment
  • Rejected
  • 15 Apr 2019

Varun Nathan

Developing a bot that can answer support queries and aid in decision making with analytics

Responding to repetitive queries from customers can overload the support team. Developing the capability to handle such repetitive queries can significantly enhance the productivity of support agents and they can utilise their time in resolving problems that are more challenging and involved. This talk will focus on the modelling approach that we at Freshworks took to develop a bot that has the a… more
  • 22 comments
  • Rejected
  • 15 Apr 2019
Session type: Discussion, Full talk of 40 mins

Adarsh Dattatri

Extract calendar events from free-form text (chats/emails) to automate scheduling

Sales teams’ activities include scheduling meetings with prospects for product demos, resolving queries and doubts about the product, and initial setup. At Freshworks, we use NLU to automatically detect meeting intent within emails/chat and generate a calendar event. This talk is about the pipeline and tools used to engineer this system. more
  • 1 comment
  • Rejected
  • 15 Apr 2019

sawinder kaur

Story of Building a Telecom Data Analytics Solution

Telecom data is quite complex - consisting of hundreds of continuous and categorical variables that capture the details of millions of users, including plans, services, roaming, phone/SMS usage, revenue, cost, etc. Through interactions with customer leadership, we arrived at the business objective of our solution: optimizing the existing plans and services and maximizing the profit. We … more
  • 6 comments
  • Rejected
  • 15 Apr 2019

Vishal Gupta

Accelerating Hiring with Data Science

At Freshworks, we receive more than 1000 applications every week. This leads to a lot of applications for our Talent Acquisition teams to process, which can be difficult. Conventionally, candidate screening at Freshworks has involved a manual review of the candidate’s resume/portfolio which cannot be scaled for smaller HR teams. We experimented with making this process smoother by implementing an… more
  • 1 comment
  • Rejected
  • 15 Apr 2019
Session type: Short talk of 20 mins

Akash Khandelwal

Incubation to Production : Building Data Products for ever changing business @Flipkart

This talk will cover our journey of taking data products from incubation to production. We saw that by externalizing and crowdsourcing in-lab experiments, we were able to spin off completely new products through quick prototyping, thereby preparing us for a fast-evolving business environment as Flipkart grew exponentially. more
  • 1 comment
  • Rejected
  • 15 Apr 2019
Session type: Lecture, BOF session of 1 hour

AbdulMajedRaja

From ML Dashboards to ML Web Apps - R with Shiny

One of the beautiful gifts that R has got (that Python misses) is the package – Shiny. Shiny is an R package that makes it easy to build interactive web apps straight from R. This session will help you build ML solutions and Dashboards as web apps using R Shiny. more
  • 0 comments
  • Rejected
  • 15 Apr 2019

AbdulMajedRaja

Introduction to R for Data Science [Workshop]

R is one of the most popular programming languages used in Data Science. Known for its simplicity and easy-to-start working environment, R has been the language of choice of many non-programmers, and its rich ecosystem enables it to perform a variety of Data Science related tasks. The objective of this workshop is to help you get started with R for you to move forward with your Data S… more
  • 5 comments
  • Rejected
  • 15 Apr 2019
Session type: Workshop

AbdulMajedRaja

From ML Dashboards to ML Web Apps - R with Shiny

One of the beautiful gifts that R has got (that Python misses) is the package – Shiny. Shiny is an R package that makes it easy to build interactive web apps straight from R. This session will help you build ML solutions and Dashboards as web apps using R Shiny. more
  • 4 comments
  • Rejected
  • 15 Apr 2019
Session type: Workshop

AbdulMajedRaja

What happens out there? In the Real-World, With R

This talk contains two main sections: the first explains the (non-obvious) things that are possible with R, and the second covers how well-known organizations are using R. R is one of the most popular programming languages preferred in Data Science / Analytics. more
  • 5 comments
  • Rejected
  • 15 Apr 2019
Session type: Tutorial

AbdulMajedRaja

Become Language Agnostic by Combining the Power of R with Python using Reticulate

Language Wars have always been there for ages and it’s got a new candidate with Data science booming - R vs Python. While the fans are fighting R vs Python, the creators (Hadley Wickham (Chief DS @ RStudio) and Wes McKinney (Creator of Pandas Project)) are working together as Ursa Labs team to create open source data science tools. A similar effort by RStudio has given birth to Reticulate (R Inte… more
  • 6 comments
  • Rejected
  • 15 Apr 2019
Session type: Workshop

Arvind Aravamudhan

Democratizing ML at Freshworks

The data journey usually begins with raw data, advances to data analytics and then matures to data science. The key for reaching data science maturity is to organize and store data for large scale crunching. ML/AI being one of the key growth drivers for Freshworks, in the presentation I will walk through how we solved the data organization and access problem for ML/AI use cases by building our ow… more
  • 4 comments
  • Rejected
  • 15 Apr 2019
Session type: Short talk of 20 mins

Deepak Sharma

Scalable NLP Pipeline for Building Catalogue for MSMEs

We want to build a catalogue for millions of MSMEs across India. To achieve this, we are bootstrapping the catalogue from raw product descriptions provided by the inventories of current customers. This is a rich source of product entities. However, since this data is specific to each customer, it is highly contextual with little common grammar. This makes it extremely difficult to identify a product entity… more
  • 13 comments
  • Rejected
  • 15 Apr 2019
Session type: Lecture Session type: Tutorial

AbdulMajedRaja

What's Machine Learning Bias?

We have constantly been told that “Computers don’t lie”. Indeed, computers don’t lie, but neither do they speak the truth. A computer does what its master programs it to do. Similarly, a model wouldn’t lie unless the machine learning engineer wants it to lie. more
  • 20 comments
  • Rejected
  • 16 Apr 2019
Session type: Full talk of 40 mins

Raksha M P

Verifiable Logs and DLT: A recipe for smashing UCC using hashing

An “Unsolicited Commercial Communication” (UCC) means a commercial communication which a subscriber opts not to receive. For a long time, the Telecom Regulatory Authority of India (TRAI) has been a centralized regulating body for any commercial communication, and it has now mandated a DLT-based solution, mainly to overcome problems of cost of regulation apart from establishing proofs more
  • 6 comments
  • Rejected
  • 16 Apr 2019
Session type: Lecture Session type: Short talk of 20 mins

Siddhant Panda

Text Classification, Interpretability, and Summarisation at Scale

The Freshdesk product is used by over 150,000 customers for resolving customer support tickets. Each customer configures workflows within the product that are specific to their approach to ticket resolution. Traditionally, these use a hand-tuned rule-based system that serves well when a support organisation is relatively small. However, as businesses scale and customer needs become more complex, … more
  • 3 comments
  • Rejected
  • 16 Apr 2019
Session type: Short talk of 20 mins

Fasih Khatib

Ghostbusters: Optimizing debt collections with survival models

A pay-later solution like Simpl comes with risk - some customers don’t pay their bill on time. When this happens, our collections team calls them up and gently reminds them that their bill is due. Some people even try to vanish - they ghost us - without paying their bill, resulting in escalation to our skip trace team. more
  • 14 comments
  • Confirmed & scheduled
  • 13 May 2019
Session type: Full talk of 40 mins

Chris Stucchio

The final stage of grief (about bad data) is acceptance

Over the course of my career I’ve gone through the many stages of grief; I’ve become angry at the poor quality of my data, I’ve attempted to bargain with engineering/PMs/etc for better data, and I became depressed over the issue. Now I’ve reached the final stage; I accept that my data is bad. Given that my data is bad, I then attempt to model its badness, and use that model to correct for the bi… more
  • 20 comments
  • Confirmed & scheduled
  • 10 May 2019
Session type: Full talk of 40 mins

Shashank Jaiswal

ADAM - Bootstrapping a Deep Neural Network Sequence Labeling Model with minimal labeling

Deep Learning based models have achieved high accuracy on Named Entity Recognition tasks for natural language datasets. However, their efficacy on practical domain-specific data, like product titles, is often subpar due to several challenges - 1) labeled data is scarce or unavailable; 2) noise in the form of spelling errors, missing tokens, abbreviations etc.; 3) variance in structure (as it is n… more
  • 18 comments
  • Confirmed & scheduled
  • 08 Apr 2019
Section: Full talk Technical level: Intermediate Session type: Lecture Session type: Full talk of 40 mins

Upendra Singh

How to build blazingly fast distributed computing like Apache Spark In-house?

We at ClustrData are building extremely large-scale, extremely cost-sensitive analytics solutions for our end users. Cost sensitivity is of utmost importance to us, and ease of use is the ultimate goal. We cater to customers who are extremely cost sensitive, which means whatever we build needs to be super-efficient in terms of cost, efficiency and performance. Keeping our design philosophy and co… more
  • 7 comments
  • Confirmed & scheduled
  • 03 May 2019
Session type: Short talk of 20 mins

Pratik Sinha

Technology to counter misinformation/disinformation

A lot of fact-checking tasks can be automated via technology as there are repeated instances of fake videos and images that are distributed with different narratives. With misinformation/disinformation killing people in India now and also being weaponised to attack the social fabric of the country, it is a must that those working in various related technologies come together to fight against this m… more
  • 1 comment
  • Confirmed & scheduled
  • 17 Jun 2018
Section: Crisp talk Technical level: Intermediate

Ashish Verma

Threat detection is as easy as finding a needle in a forest (even for machine learning)

The last decade has seen an exponential rise in digital adoption by enterprises. We have moved on from just the internet to the internet of things and now the internet of everything. Although connectedness has painted a much brighter future, it has also provided an opportunity for cyber criminals. Cyber security has now become one of the top priorities of enterprises. But threat detection is like fin… more
  • 1 comment
  • Rejected
  • 31 May 2019
Session type: Full talk of 40 mins

Raghav Bali

Deep Diagnosis: How is Deep Learning Impacting Medical Domain and Saving Lives

The field of Deep Learning is making huge inroads in almost all spheres. What took the world by storm, surpassing human-level performance in image classification, has today matured into a powerful tool to solve real-world problems. Today, Deep Learning is not just a research area limited to academics but a powerful tool utilized and improved by different companies/labs/institutions… more
  • 6 comments
  • Rejected
  • 31 May 2019
Session type: Full talk of 40 mins

Logesh kumar

Interpretable NLP Models

Deep learning models are known to be black boxes and lack interpretability compared to traditional machine learning models. So there is always a hesitation in adopting deep learning models in user-facing applications (especially medical applications). Recent progress in NLP, with the advent of attention-based models, LIME and other techniques, has helped to solve this. I would like to walk… more
  • 2 comments
  • Rejected
  • 31 May 2019
Session type: Tutorial

Piyush

MetaConfig driven FeatureStore with Feature compute & Serving Platform powering Machine Learning @MakeMyTrip

Developing a personalization platform to improve the customer experience of millions of Indian travellers more
  • 4 comments
  • Rejected
  • 03 Jun 2019
Session type: Full talk of 40 mins

Souradip Chakraborty

Automated Catalogue Management and Image Quality Assessment using CNN and Deep Learning

Catalogue management is a very important aspect of e-commerce as it helps visitors efficiently select the items they are interested in. On every retail website, the items in the catalogue are arranged in a particular order and orientation across different categories, and grouping and ordering them manually takes a lot of time. Secondly, image quality assessment plays a very important part in c… more
  • 2 comments
  • Rejected
  • 03 Jun 2019
Session type: Short talk of 20 mins

Navinder Pal Singh Brar

Building a multi-tenant data processing and model inferencing platform with Kafka Streams

Each week 275 million people shop at Walmart, generating multi-terabytes of interaction and transaction data. In the Customer Backbone team, we enable extracting, transforming and storing data to be served to teams such as Ads and Personalisation for building various customer-centric machine learning models such as bid models, fraud detection and omnichannel reorder. At 5 billion events/day our Ka… more
  • 5 comments
  • Rejected
  • 04 Jun 2019
Session type: Full talk of 40 mins

Navinder Pal Singh Brar

Real-time fraud detection with Kafka Streams

One of the major use cases for stream processing is real-time fraud detection. Walmart just launched a new subscription package where it provides free delivery for users who are enrolled with a monthly subscription, which can be misused sometimes. Since the fraud detection model runs on each transaction and comes with very tight SLAs, we had to increase availability in our Kafka streams cluster a… more
  • 2 comments
  • Rejected
  • 05 Jun 2019
Session type: Short talk of 20 mins

Abhishek Mungoli

Price Investment Strategy Planning with Dynamic Programming based Optimization

Operational excellence is one of the key tenets in any retail business. Promotions are a core part of any price investment strategy in a high-low market. Promotions involve cost in providing discounts and other supports. Efficiency in utilizing the budget available for the most rewarding price investment strategy is what we are driving through this paper. The investment required for reduction of … more
  • 2 comments
  • Rejected
  • 06 Jun 2019
Session type: Full talk of 40 mins

Naresh Reddy Sankapelly

Machine Learning Platform @Flipkart

Every decision at Flipkart is data driven, which implies that every team at Flipkart is adopting machine learning based solutions. The Machine Learning Platform enables data scientists and engineers to build, productionize and monitor machine learning models reliably at scale. In this talk, we will walk you through the challenges faced in building the ML Platform and the evolution of the platform. We will also cov… more
  • 4 comments
  • Rejected
  • 06 Jun 2019
Session type: Full talk of 40 mins

Abhishek Mungoli

How GPU Computing literally saved me at work

Distributed/parallel computing is at the heart of new technology. Every company, big or small, wants to make the most of the technology available to it. One such niche technology is GPU computing. If used cautiously, it can save a lot of computing effort and time across applications. Businesses, with the boom in machine learning/deep learning techniques, are on the way to leverage this technology in t… more
  • 10 comments
  • Rejected
  • 06 Jun 2019
Session type: Short talk of 20 mins

Jayanta Pal

Route risks using driving data on road segments

Whether going out for dinner in Cincinnati during an extended stay or planning a long road trip across the wild west of the US, the first thing one looks at is a maps app, which shows the relative distance, estimated time and congestion areas of different routes for the drive. Zendrive built state-of-the-art technologies on its huge cache of driving data from smartphones and OBD, to add a significant dimensi… more
  • 3 comments
  • Rejected
  • 20 Jun 2018
Technical level: Intermediate

Ayan Ghatak

fStream - Continuous Intelligence @ scale in Flipkart

We live in an age of ML models, deeply personalised user experiences and quick data driven business decisions. The common denominator enabling all of it is data processing systems, especially real time ones. more
  • 2 comments
  • Rejected
  • 08 May 2019
Section: Full talk Technical level: Intermediate Session type: Discussion

Souradip Chakraborty

Siamese Triple Ranking Convolution Network in Signature Forgery Detection

Identifying a credible signature match based on a base signature of a person is an age old problem. Despite recent automation and advances in this field using image recognition, a lot remains to be explored. We have developed an intelligent framework which can automatically detect a forged signature even if it is highly skilled, based on the developed feature embeddings and the corresponding algo… more
  • 2 comments
  • Rejected
  • 13 Jun 2019
Session type: Full talk of 40 mins

Arjun BM

MUDPIPE - Malicious URL Detection for Phishing Identification and Prevention

Social engineering is one of the most dangerous threats facing every individual and modern organization. Phishing is a well-known, computer-based, social engineering technique. Attackers use disguised emails as a weapon to target large companies. Numerous fake websites have been developed to mimic trusted websites, with the aim of stealing financial assets from users and organizations. With the h… more
  • 10 comments
  • Confirmed & scheduled
  • 13 Jun 2019
Session type: Short talk of 20 mins

Deepthi Chand

An open Assistive translation framework for Indic Language - Samantar

India is a land of many languages. There are 23 official and many more unofficial languages in prevalent use in day-to-day conversations. Unfortunately, information dissemination to the low-resource languages gets difficult because of the geo-spatial distances. Popular translation platforms have helped to fill this gap in major languages, but their efficiency is challenged by the lack of availability of… more
  • 2 comments
  • Confirmed & scheduled
  • 13 Jun 2019
Session type: Short talk of 20 mins

Shikhar Gupta

Price Recommendations - Driving Revenue Strategy Using Machine Learning

Brief Description: Hotel pricing offers a lot of room for optimisation, given that there is limited inventory to sell each day. This session focuses on how Treebo developed an automated machine learning based pricing engine within 2 months and scaled it up in the next 6 months to recommend real-time prices for 400 hotels. This resulted in ~26% improvement in booked revenue, 30 days in advance. more
  • 1 comment
  • Rejected
  • 13 Jun 2019
Session type: Full talk of 40 mins

Somya Anand

A journey of AI driven analytics insights engine

At Mindtickle, we deal with interactions from different personas such as managers, learners, admins and site owners. Given the complexity of the platform, it is difficult to keep track of the most critical activities amidst all the ongoing activities. We want to build a machine-assisted, auto-governed platform for leaders & admins to effectively run the best enablement programs for their sales teams. more
  • 3 comments
  • Rejected
  • 14 Jun 2019
Session type: Full talk of 40 mins

Pruthvi Raj

Diksuchi: Data quality Monitoring platform for @scale batch data pipelines at Walmart

We, the Customer Backbone team at Walmart, are building a customer identity and activity graph with around 20+ billion nodes and 30 billion edges, which serves as the lifeline of customer data for multiple pillars such as marketing, targeting, personalization, data sciences, etc. While building the graph using Spark and Hive pipelines, we generate many intermediate tables/states and output tables. … more
  • 3 comments
  • Rejected
  • 14 Jun 2019
Session type: Short talk of 20 mins

Vidyasagar Reddy

Using Apache Nifi to manage a real time master data foundation @ Nike

Nike has a wide variety of systems in the enterprise landscape. All these systems produce data in different shapes and sizes. We are building the Nike data foundation so that we meet the below goals. more
  • 5 comments
  • Rejected
  • 14 Jun 2019
Session type: Short talk of 20 mins

ChandraSekhar Kandavilli

Airflow for the Enterprise (Nike's Journey)

Nike has a wide variety of systems in the enterprise landscape. All these systems produce data in different shapes and sizes. We are building the Nike data foundation so that we meet the below goals. more
  • 2 comments
  • Rejected
  • 14 Jun 2019
Session type: Short talk of 20 mins

Deepak Arora

CNN for Query Categorization in E-Commerce

Query categorization is a fundamental problem in e-commerce: for a query, find the most relevant category of products. Think about it: Apples bought from electronics do not taste sweet. Apples at the grocery store don’t have an OS. Queries have multiple tokens. The longer the query, the fewer products supporting it. Milk 2 % and 2% Milk probably mean the same product. more
  • 6 comments
  • Rejected
  • 14 Jun 2019
Session type: Short talk of 20 mins

Nitin Gupta

It's Launched! Why do I need to continuously benchmark and monitor my computer vision model?

Open source models like ImageNet and ResNet have opened the door to enable millions of computer vision use cases. But launching an enterprise computer vision application doesn’t end when the model is trained - that’s just the first step. To build an end-to-end solution, one needs to understand the appropriate steps and best practices to follow. more
  • 3 comments
  • Rejected
  • 14 Jun 2019
Session type: Short talk of 20 mins

Nandan Thakur

GuidedLDA: A Python Package using Semi-Supervised Topic Modelling by Incorporating Lexical Priors

Topic Models have a great potential for helping users understand document corpora. This potential is impeded by their purely unsupervised nature, which often leads to topics that are neither entirely meaningful nor effective in extrinsic tasks. In this talk, I plan to explain how we wrote our own form of Latent Dirichlet Allocation (LDA) in order to guide topic models to learn topics of specific … more
  • 13 comments
  • Rejected
  • 14 Jun 2019
Session type: Tutorial

archit agarwal

Turning Data into Actionable Insights in Real Time

This talk will share our learnings and best practices in building our data pipeline, which handles billions of events per day with latency in single-digit seconds. It covers how we moved from Spring microservices to the Akka framework and how we reduced our VM footprint by 85% using Akka. We have seen huge growth in data in recent years, and using Spring was not scalable. I will share how PayPal… more
  • 3 comments
  • Rejected
  • 15 Jun 2019
Session type: Short talk of 20 mins

Nandan Thakur

FlashText – A Python Library 28x faster than Regular Expressions for NLP tasks

Data Science starts with data cleaning. When developers are working with text, they often clean it up first, sometimes by replacing keywords (“Javascript” with “JavaScript”) and at other times by finding out whether a keyword (“JavaScript”) was mentioned in a document. In today’s fast-moving world, bigger and bigger datasets are coming up with tens of thousands to millions of documents. The amount o… more
  • 8 comments
  • Rejected
  • 15 Jun 2019
Session type: Short talk of 20 mins

Kritika Upadhyay

Automatic Accuracy and Compliance

This paper is written for an audience with prior or limited experience in Identity and Access Management, and focuses on access provisioning and audit coordination towards compliance and other regulatory requirements. Access provisioning (APS) is divided into four phases: formation of the APS team, stabilizing the team, automating processes and merging compliance requirements onto the database o… more
  • 1 comment
  • Rejected
  • 15 Jun 2019
Session type: Short talk of 20 mins

Jaidev Deshpande

Automating Workflows for AI Projects

As technology gets cheaper and more available, we start taking it for granted. It’s easier than ever before to perform fairly exciting AI tasks with as little as tens of lines of code. As data grows, our approach to ML problems often, and understandably, becomes haphazard. As GPUs become more widely available, we subconsciously think that throwing enough artificial neurons at a problem will event… more
  • 13 comments
  • Rejected
  • 21 Apr 2019
Section: Full talk Technical level: Intermediate Session type: Lecture

Namit Mahuvakar

Analysing high throughput Data in Real Time

Namit Mahuvakar, Data Engineering at Hotstar more
  • 11 comments
  • Confirmed & scheduled
  • 15 Jun 2019
Session type: Full talk of 40 mins Session type: Short talk of 20 mins

Goda Ramkumar

Journey to build Data Driven culture in the Startup Ecosystem - Why, How and What?

The Startup Ecosystem is expanding and bringing innovative ideas to the market. As these startups scale and build products and services that act as sensors to collect huge amounts of data, the key question that needs to be answered for each one of the startups is “how to make data useful for business?”. The presentation will talk through an approach to start the data driven journey, caveats along the… more
  • 2 comments
  • Rejected
  • 17 Jun 2019
Session type: Short talk of 20 mins

Srihari Srinivasan

Crafting Better Data Pipelines - Some Ideas

The adoption of distributed processing infrastructure heralded a new way of building data processing systems. Shifting to a more generic term, Data Pipelines (over legacy ETL), has helped elevate the architecture of data processing systems from being purely batch oriented to a more hybrid one combining batch, live and real-time elements. With this shift still active, it is imperative that we rais… more
  • 1 comment
  • Rejected
  • 17 Jun 2019
Session type: Full talk of 40 mins

Aniruddha M Godbole

Defining and Solving Data Science for Finance Problems: A Case Study

In this talk the speaker shares his understanding about the challenges of applying Data Science for Finance, takes an example in which he was involved in formulating a challenging problem and where cutting-edge Machine Learning research was used. Finally, the speaker offers his thoughts on how to go about formulating Data Science for Finance problems. more
  • 7 comments
  • Rejected
  • 07 May 2019
Session type: Full talk of 40 mins

raman gupta

How we build highly scalable and multi-tenant orchestration service using Apache Airflow on Kubernetes

We have different use cases which require some sort of workflow management and scheduling. For example, there is a use case to generate scheduled reports. There are ML-related use cases to author and manage multi-step workflows. There are ETL jobs, etc. Currently teams manage their own schedulers, like cron or some workflow manager, to meet these use cases. Some teams have also set up Apache Airflow to me… more
  • 3 comments
  • Rejected
  • 30 May 2019
Session type: Short talk of 20 mins

Peter Wang

The Anaconda Journey

The founder of Anaconda talks about the history (and pre-history) of PyData, Anaconda, and the modern Python data science ecosystem. Candid stories about successes and failures along the way, and how the two are often intertwined. more
  • 2 comments
  • Confirmed & scheduled
  • 20 Jun 2019
Session type: Full talk of 40 mins

Peter Wang

State of Data Science & Machine Learning

As machine learning and AI become adopted at an increasing rate, businesses and practitioners face new types of challenges. At the heart of many of these lies an uncomfortable truth: that data science is not merely a new kind of technical specialty, but rather that it represents an opportunity for deep business transformation. In this talk, Peter speaks to this concept that Data Science isn’t jus… more
  • 0 comments
  • Confirmed & scheduled
  • 20 Jun 2019
Session type: Full talk of 40 mins

Ayush mittal

Feed Generation @ShareChat

ShareChat is India’s largest vernacular social network platform, built to enable the next generation of India’s internet users. ShareChat is available in 14 vernacular languages. At ShareChat our data is fresh, with most users coming online for the first time; our primary goal is to serve the most relevant content to users at the appropriate time. In this talk we will discuss the new challenges these first t… more
  • 3 comments
  • Confirmed & scheduled
  • 24 Jun 2019
Session type: Short talk of 20 mins

Sherin Thomas

Tutorial: Taking deep learning to production with RedisAI

Taking deep learning models to production, and doing so reliably, is one of the next frontiers of DevOps. This talk introduces RedisAI, a joint effort by [tensor]werk and RedisLabs. RedisAI is a Redis module that adds tensors & graphs as Redis data types, enabling execution of deep learning graphs on the CPU and GPU using multiple backends (PyTorch, TensorFlow, and ONNXRuntime) simultaneously, wh… more
  • 21 comments
  • Confirmed & scheduled
  • 30 May 2019
Section: Full talk Technical level: Intermediate Session type: Demo Session type: Tutorial

Aditya Karnik

Machine learning to save lives on the road

Every year over 1.3M people die on roads. In recent years the rates of fatality and collisions have increasingly gone upward, reversing a several decade long downward trend. more
  • 6 comments
  • Rejected
  • 28 Jun 2019
Session type: Full talk of 40 mins

sandeep khurana

Demystifying Social Network Analysis (SNA)

The session is aimed at demystifying the world of network analytics by sharing motivating examples from some popular research papers. I will also provide a brief theoretical basis of network analysis and introduce network metrics, tools and resources. In the last section of the session, I will share some recent applications of SNA from public discourse. more
  • 2 comments
  • Confirmed & scheduled
  • 01 Jul 2019
Session type: Full talk of 40 mins

Aravind Putrevu

Elasticsearch Workshop: Search and Beyond

Elasticsearch is a great technology and very easy to get started with. Over time, Elasticsearch can be moulded to support various use cases, especially logging, metrics and APM. But the core of it is search and its APIs. This workshop will improve your understanding of configuring Elasticsearch for production. It will also introduce the latest features in Elasticsearch, Logstash, Kibana and Beats. more
  • 0 comments
  • Submitted
  • 09 Jul 2019
Session type: Workshop

Karnam Vasudeva Rao

How We Built a ML Model to Predict Proteins for Insecticidal Activity?

To improve crop yield, agriculture companies have successfully adopted the development of insect-resistant crops by expressing insecticidal (insect-killing) proteins in plants. As a leader in the agricultural biotechnology industry, Bayer tests hundreds of genes every year for insecticidal activity in its proprietary pipeline to develop the next generation of insect control solutions. Identificati… more
  • 2 comments
  • Confirmed & scheduled
  • 26 Jun 2019
Session type: Full talk of 40 mins Session type: Short talk of 20 mins

Sandeep Khurana

Demystifying Social Network Analysis (SNA): a tutorial

The session is aimed at demystifying the world of network analytics by sharing motivating examples from some popular research papers. I will also provide a brief theoretical basis of network analysis and introduce network metrics, tools and resources. This tutorial will set the context for my talk which will follow the next day. more
  • 1 comment
  • Confirmed & scheduled
  • 12 Jul 2019
Session type: Tutorial

Shadab Siddiqui

Data Security and startups : Make the ends meet

Data security refers to protective digital privacy measures that are applied to prevent unauthorized access to computers, databases and websites. Data security also protects data from corruption. Many resource-strapped startups gauge their commitment level to security by assessing the financial expense to the company. Instead, the recommendation is to define security spend by a company’s possible… more
  • 2 comments
  • Confirmed & scheduled
  • 12 Jul 2019
Technical level: Intermediate Session type: Lecture Session type: Short talk of 20 mins

Ramanan Balakrishnan

Birds of Feather (BOF) session: Intent classification and personalization

When it comes to developing a comprehensive natural language understanding system, intent classification is one of the first challenges to overcome. Without developing an understanding of the context of a text, it becomes almost impossible to interpret entities that may be recognized in later stages. One of the main reasons intent classification is popular is also its use in achieving personaliza… more
  • 0 comments
  • Confirmed & scheduled
  • 15 Jul 2019
Session type: BOF session of 1 hour Session type: Birds of a Feather session of 1 hour

Namrata Hanspal

BoF on Interpretability of ML Models

Complex machine learning models work very well at prediction and classification tasks but become really hard to interpret. On the other hand simpler models are easier to interpret but less accurate and hence oftentimes we are made to take a call between interpretability and accuracy. more
  • 1 comment
  • Confirmed & scheduled
  • 15 Jul 2019
Session type: BOF session of 1 hour Session type: Birds of a Feather session of 1 hour

Venkateshan

[BoF] Tackling the complex inter-dependent challenges in transport planning and assignment

Topics to be discussed: Variations in the planning/assignment problem formulation and scope. more
  • 0 comments
  • Confirmed & scheduled
  • 15 Jul 2019
Session type: BOF session of 1 hour Session type: Birds of a Feather session of 1 hour

Nitin Gupta

Age of AI Ops

We look at the evolution and rise of AI Ops. AIOps is the technology solution leveraging machine learning and data analytics to help automate how we react to issues in real time across layers of infrastructure and software. more
  • 6 comments
  • Confirmed & scheduled
  • 29 May 2019
Session type: Full talk of 40 mins

Venkata Pingali

BoF on ML platforms

On machine learning platforms, journeys in building them, and managing infrastructure for ML platforms more
  • 8 comments
  • Confirmed & scheduled
  • 17 Jul 2019
Session type: BOF session of 1 hour Session type: Birds of a Feather session of 1 hour

Akash Khandelwal

Birds of a Feather: Data driven culture in the startup ecosystem

Learn how data driven culture can be inculcated when starting up. more
  • 0 comments
  • Confirmed & scheduled
  • 17 Jul 2019
Session type: Birds of a Feather session of 1 hour

Ravi Ranjan

BoF: ML Model Management

Data is the new oil and its size is growing exponentially day by day. Most companies are leveraging data science capabilities extensively to affect business decisions, perform audits on ML patterns, decode faults in business logic, and more. They run a large number of machine learning models to produce results. more
  • 2 comments
  • Confirmed & scheduled
  • 18 Jul 2019
Session type: BOF session of 1 hour Session type: Birds of a Feather session of 1 hour

Swaminathan Padmanabhan

Multi-tenancy in Machine learning (the SaaS perspective)

Given the emergence of several SaaS product companies in India, there’s a lot of recent interest in provisioning ML capabilities over the cloud; and enabling SaaS customers to make use of such capabilities through a self-serve model. These SaaS customers should be able to tailor the ML capabilities on-the-fly to suit their needs, e.g. they should be able to adjust confidence thresholds of a virtual c… more
  • 0 comments
  • Confirmed & scheduled
  • 19 Jul 2019
Session type: BOF session of 1 hour Session type: Birds of a Feather session of 1 hour

simrat

Birds of a Feather: ML in production

Most ML effort stagnates at the stage of building ad-hoc models, with only thin layers of customization around them. This is mostly okay, but there are usually no guarantees about elasticity, uptime or even accuracy (since updating models is non-trivial) - all of which are crucial to business. This BoF invites the audience to discuss problems, paradigms and best practices around deploying machine… more
  • 3 comments
  • Confirmed & scheduled
  • 21 Jul 2019
Session type: BOF session of 1 hour Session type: Birds of a Feather session of 1 hour

Kranthi Mitra

Challenges and approaches for instrumenting and cleaning 'real'/ ugly data

Most practicing data scientists have those “bad data days” where you realize the data is corrupt, or not what you assumed the data to be, or labels are not right or even worse. What if we work in a paradigm assuming: “all data is corrupt, some is useful”, while at the same time instrumenting for any data which can be captured? In such a setting, how to go about various day-to-day data cleaning ch… more
  • 0 comments
  • Confirmed & scheduled
  • 22 Jul 2019
Session type: Birds of a Feather session of 1 hour

Amit Kapoor

Unpacking the Learning Paradigms

Struggling to unpack the plethora of learning paradigms in ML? Let us have a dialogue to both understand them better and build a better mental model to explain them to everyone. more
  • 0 comments
  • Confirmed & scheduled
  • 21 Jul 2019
Session type: Birds of a Feather session of 1 hour

Hosted by

The Fifth Elephant - known as one of the best data science and Machine Learning conferences in Asia - has transitioned into a year-round forum for conversations about data and ML engineering; data science in production; data security and privacy practices. more