Submissions

Jul 2016

28 Thu 08:30 AM – 06:25 PM IST

29 Fri 08:30 AM – 06:15 PM IST

30 Sat 08:45 AM – 05:00 PM IST

31 Sun 08:15 AM – 06:00 PM IST

NIMHANS Convention Centre

The Fifth Elephant is India’s most renowned data science conference. It is a space for discussing some of the most cutting-edge developments in machine learning, data science, and the technology that powers data collection and analysis.

Machine Learning, Distributed and Parallel Computing, and High-performance Computing continue to be the themes for this year’s edition of Fifth Elephant.

We are now accepting submissions for our next edition, which will take place in Bangalore on 28-29 July 2016.

#Tracks

We are looking for application-level and tool-centric talks and tutorials on the following topics:

Deep Learning
Text Mining
Computer Vision
Social Network Analysis
Large-scale Machine Learning (ML)
Internet of Things (IoT)
Computational Biology
ML in healthcare
ML in education
ML in energy and ecology
ML in agriculture
Analytics for emerging markets
ML in e-governance
ML in smart cities
ML in defense

The deadline for submitting proposals is 30th April 2016

Format

This year’s edition spans two days of hands-on workshops and conference. We are inviting proposals for:

Full-length 40-minute talks.
Crisp 15-minute talks.
Sponsored sessions of 15 minutes (limited slots available; subject to editorial scrutiny and approval).
Hands-on workshop sessions of 3 or 6 hours.

Selection process

Proposals will be filtered and shortlisted by an Editorial Panel. We urge you to add links to videos / slide decks when submitting proposals. This will help us understand your past speaking experience. Blurbs or blog posts covering the relevance of a particular problem statement and how it is tackled will help the Editorial Panel better judge your proposals.

We expect you to submit an outline of your proposed talk – either in the form of a mind map or a text document or draft slides within two weeks of submitting your proposal.

We will notify you about the status of your proposal within three weeks of submission.

Selected speakers must participate in one or two rounds of rehearsals before the conference. This is mandatory and helps you prepare well for the conference.

There is only one speaker per session. Entry is free for selected speakers. As our budget is limited, we will prefer speakers from locations closer to home, but will do our best to cover anyone exceptional. HasGeek will provide a grant to cover part of your travel and accommodation in Bangalore. Grants are limited and made available to speakers delivering full sessions (40 minutes or longer).

Commitment to open source

HasGeek believes in open source as the binding force of our community. If you are describing a codebase for developers to work with, we’d like it to be available under a permissive open source licence. If your software is commercially licensed or available under a combination of commercial and restrictive open source licences (such as the various forms of the GPL), please consider picking up a sponsorship. We recognise that there are valid reasons for commercial licensing, but ask that you support us in return for giving you an audience. Your session will be marked on the schedule as a sponsored session.

Key dates and deadlines

Revised paper submission deadline: 17 June 2016
Confirmed talks announcement (in batches): 13 June 2016
Schedule announcement: 30 June 2016
Conference dates: 28-29 July 2016

##Venue
The Fifth Elephant will be held at the NIMHANS Convention Centre, Dairy Circle, Bangalore.

##Contact
For more information about speaking proposals, tickets and sponsorships, contact info@hasgeek.com or call +91-7676332020.

Hosted by

The Fifth Elephant

The Fifth Elephant - known as one of the best data science and Machine Learning conferences in Asia - has transitioned into a year-round forum for conversations about data and ML engineering; data science in production; data security and privacy practices. more

Let your Big Data Processing take flight with Apache Falcon

At InMobi, a mobile advertising company, we see events arriving in excess of 10 billion per day. Analysis, reporting and inferencing from these requests (and responses served) is key to serving the right ad, to the right person, at the right time. We have nearly 200 complex big data pipelines that run against various data sources. Managing so many pipelines and the associated data was becoming a … more

0 comments
Confirmed & scheduled
25 Feb 2016

Section: Crisp talk Technical level: Beginner

Real-time Ingestion of logs into Hive with a low latency, to query and respond to events

The threat landscape is changing very rapidly and we are seeing more and more targeted attacks. Detecting such attacks requires a data-driven approach, which requires processing PBs of telemetry data (AV detections, system access logs, network statistics etc.) received from end points, firewalls, gateways etc. more

0 comments
Cancelled
14 Mar 2016

Section: Crisp talk Technical level: Intermediate

Long Running Services on YARN: Future of Service Deployment & Management via Hadoop

YARN has long aspired to be an operating system for the data center. In order to bring that promise to fruition, it must be able to host services that transcend the usual provision-execute-teardown lifecycle of most Hadoop processing frameworks. In this talk, we will share what we’ve learned, building long running services together with on-demand scaling and monitoring on YARN. We will first disc… more

12 comments
Submitted
14 Mar 2016

Technical level: Advanced

Smart Energy

Smart Energy Management collects data from various sensors (end points) using open source IoT/IoE frameworks, analyses usage patterns using machine learning algorithms, and dynamically sets policies to optimize energy resources. more

0 comments
Shortlisted
15 Mar 2016

Section: Crisp talk Technical level: Intermediate

Timely Dataflow

Many data processing tasks require low-latency interactive access to results, iterative sub-computations, and consistent intermediate outputs so that sub-computations can be nested and composed. Timely Dataflow is the computational model that addresses these challenges as a unified system, as opposed to bolting batch and stream processing systems together. It was first presented as part of Naiad (SO… more

0 comments
Confirmed & scheduled
22 Mar 2016

Section: Crisp talk Technical level: Advanced

Increasing Trust and Efficiency of Data Science using dataset versioning

As data science grows and matures as a domain, decision makers are asking harder questions about the trust and efficiency of the data science process. Some of them include: more

5 comments
Confirmed & scheduled
27 Mar 2016

Section: Crisp talk Technical level: Intermediate

Design Patterns in IoT/IoE

In this talk, I will share some of the design patterns which we have implemented in Smart Buildings/Smart Cities and Smart Analytics solutions. more

0 comments
Shortlisted
30 Mar 2016

Section: Crisp talk Technical level: Intermediate

Emerging patterns of lifestyle impact on personal health & wellness

Lifestyle is changing at a very rapid pace as we enter the internet era. The pace of evolution in technology, lifestyle, work environment, etc. is more rapid than ever before and has changed our lifestyle and health. To be able to understand the new health and wellness patterns emerging, and help a preventive health care based start-up design improved solutions to help pe… more

1 comment
Waitlisted
10 Apr 2016

Section: Crisp talk Technical level: Beginner

Model Visualisation

Though visualisation is used in data science to understand the shape of the data (data-vis), it is not widely used for the models developed; which are largely evaluated based on numerical summaries. Model visualisation (model-vis) can help understand: the shape of the model, the impact of parameters & different input data on the model, the fit of the model & where it can be improved. more

0 comments
Confirmed & scheduled
13 Apr 2016

Section: Full talk Technical level: Beginner

What do machine learning and high performance computing have to do with big cats in the wild?

Science has played a crucial role in our understanding of big cats in the wild and in their conservation. When we focus on the aspect of “gaining knowledge” or “learning”, few other approaches have done better than rigorous application of scientific methods. As we all know too well, the scientific method involves careful observation, construction of relevant theories and confronting these theorie… more

0 comments
Confirmed & scheduled
15 Apr 2016

Section: Full talk Technical level: Intermediate

Statistical Models for Better Customer Engagement

We look at the various stages of a sales/marketing funnel, and see how data science can be used to improve effectiveness of the processes, understand what the customer wants, and discover new ways of engagement in each stage. We discuss the statistical models, the business metrics they drive, and share real life examples from our experience. more

0 comments
Submitted
21 Apr 2016

Technical level: Intermediate

Big Data Structures

Analysing terabyte data sets with heavy data processing is a common task these days. A data structure is a particular way of organizing data in a computer so that it can be used efficiently. For Big Data, the computer changes to a cluster and also the way of organizing the data is distributed. The usage patterns are changing from being precise changes to being probabilistic. False positive matche… more

3 comments
Waitlisted
24 Apr 2016

Section: Full talk Technical level: Beginner

Unified & Distributed Test Infrastructure at Scale (Hortonworks Data Platform Testing)

Extensive software testing is required before the actual release to ensure software quality, and the software has to perform equally well on every platform and combination of configurations. When it comes to a data platform, the testing is even more complicated due to the variety of clusters, storage layers, operating systems, JDK versions, database flavors, execution engine, security config, com… more

0 comments
Rejected
24 Apr 2016

Section: Crisp talk Technical level: Intermediate

Taking Fashion and Lifestyle Commerce Towards SKUs Using Deep Image and Text Parsing

In this talk, I will describe challenges, insights, innovations and experiences in building a large-scale deep learning system to prepare SKUs (Stock Keeping Units) for millions of fashion products. more

0 comments
Confirmed & scheduled
25 Apr 2016

Section: Full talk Technical level: Intermediate

Dr. Elephant - Self-Serve Performance Tuning for Hadoop and Spark

Hadoop is a framework that facilitates the distributed storage and processing of large distributed datasets involving a number of components interacting with each other. Because of its large and complex framework, it is important to make sure every component performs optimally. While we can always optimize the underlying hardware resources, network infrastructure, OS, and other components of the … more

1 comment
Confirmed & scheduled
25 Apr 2016

Section: Crisp talk Technical level: Intermediate

Designing Data Products

Coming up with a good model is very important for any machine learning system. But to build a good data product, there are a bunch of other things that go along with the model. The focus of this talk will be to discuss those things and share our learnings and recommendations based on our experience. more

1 comment
Cancelled
26 Apr 2016

Technical level: Intermediate

Apache Storm past, present and future

Apache Storm is one of the most mature and widely adopted real-time data platforms available. In this session we look at how Storm has evolved over the years, take an in-depth look at the new features that were added in the recently released Apache Storm 1.0 and how some of those features can be used to solve common streaming and IoT use cases. more

0 comments
Shortlisted
26 Apr 2016

Technical level: Intermediate

(Workshop) Understanding neural networks by building few from scratch

I firmly believe that there is an elegant and understandable theory behind neural networks. more

0 comments
Rejected
27 Apr 2016

Section: Workshop Technical level: Intermediate

Visually reading the configuration of a Rubiks cube using Probabilistic Graphical Model

Identify the edges in the field of view and then correlate the sequence of frames to infer the configuration of the Rubik’s cube. The audience will take away how one can correlate information from video frames to infer the kinematics of the object in the field of view. more

0 comments
Rejected
27 Apr 2016

Technical level: Intermediate

Forecasting the degradation of Network KPIs

In this talk, we present a methodology to predict network degradation in the telecom sector. We will explain how to forecast degradation of network key performance indicators (KPIs) and provide alerts (24 hours in advance) to the network operations team so that it can take preemptive action before degradation affects network performance. more

3 comments
Waitlisted
28 Apr 2016

Section: Crisp talk Technical level: Intermediate

Machine Learning - Democratized

Machine Learning is no longer a science reserved for data scientists and data engineers. Cloud-based machine learning services have democratized the entire process of machine learning, from data science to data engineering to data visualization. You no longer need to be an expert in any of these to get a taste of machine learning or see how it works. The cloud based ML options even allow you … more

0 comments
Rejected
28 Apr 2016

Section: Full talk Technical level: Beginner

Purpose, Speed & Visibility : Facilitating product discovery & engagement on a e-commerce website

Each product on an ecommerce website has an opportunity to sell, and market dynamics determine what’s selling and at what speed. This has merchandising implications for stock re-fill, flash sales, promotions & special events - along with the actions a merchant’s platform team takes in anticipation of such events. By reverse engineering this quantitatively, and tuning the proprietary Search rank… more

0 comments
Confirmed & scheduled
29 Apr 2016

Section: Full talk Technical level: Intermediate

Artificial Intelligence for Efficient Financial Markets

Artificial Intelligence (AI)! This is not just the name of the 2001 Spielberg movie! It is also the field of study to create machines capable of intelligent behavior. more

0 comments
Rejected
29 Apr 2016

Section: Crisp talk Technical level: Intermediate

Discovering App Relationships in Smart Phones through Large Scale Mining of User Journey Data

User experience while navigating through home screen and apps is a key differentiator for any smart phone. Building a user interface giving ease of use and personalized and contextualized home screen requires deep understanding of how different users are using their phones. Mobile OEMs periodically collect application usage data from millions of smart phone users. Analyzing this massive amount of… more

1 comment
Submitted
29 Apr 2016

Section: Full talk Technical level: Intermediate

Interactive data transformations at scale

One set of ETL tools allows building ETL pipelines for large datasets, however these tools do not provide data-level interactivity. There’s another set of data-prep tools that allow interactive data transformations, however only for a single table (or for datasets that can fit in the memory of a single machine). The challenge is to provide the best of both worlds - interactive data transformation… more

0 comments
Cancelled
29 Apr 2016

Section: Sponsored Technical level: Beginner

High performance computing using Spark

Spark has revolutionized the way big data computations are done. It provides an efficient way to do distributed data processing. In this session, I will cover our experience of implementing a large-scale big data platform (> 100 TB) using Spark, along with the challenges faced and lessons learned. more

1 comment
Submitted
29 Apr 2016

Technical level: Intermediate

Security Analytics at Web Scale

• What is Security Analytics
• How Symantec discovers risks and weaknesses in Enterprises more

0 comments
Rejected
29 Apr 2016

Section: Full talk Technical level: Intermediate

Logging at scale using Graylog - Billion+ messages, 100K req/sec

With the advent of micro-services and dozens of releases per day, logs are the bread and butter for a successful real-time technology platform like OlaCabs. In this talk, I will present our logging pipeline and the challenges we faced while doing it at Ola scale. more

2 comments
Confirmed & scheduled
30 Apr 2016

Section: Crisp talk Technical level: Intermediate

Machine Learning Application in MicroFinance

Artoo is a loan origination system (LOS); our aim is to improve financial inclusion in the world (starting with India). As a testament to our mission, we have helped disburse 1 Lac loans worth 1,000 crores (in the last two years). We want to share our experience of using data, and eventually data science, in helping our clients take the right call while disbursing loans. more

0 comments
Rejected
30 Apr 2016

Section: Crisp talk Technical level: Beginner

Sensor Analytics for IoT and Embedded Systems

Analytics-driven embedded systems are here! We’ll show this in action by classifying human activity in real-time using sensor data from a smartphone accelerometer. The demo will show a complete workflow: – pre-processing with digital filtering and frequency analysis, – exploring different classification algorithms (such as decision trees, support vector machines, or neural networks), and – automa… more

0 comments
Rejected
30 Apr 2016

Section: Crisp talk Technical level: Intermediate

Data-Driven Decision Making in Indian Agriculture: the Present and the Future

Data-driven decision making is critical in sectors like agriculture, health, and education where well-planned initiatives have the power to literally change lives. Lack of a consolidated platform with access to relevant data, however, hinders objectivity and efficiency in the decision making process for the decisions that matter most. In this session, we reveal how we integrated relevant data — p… more

0 comments
Confirmed & scheduled
30 Apr 2016

Section: Crisp talk Technical level: Intermediate

Knowledge Inference: Estimating how much the student knows

Very high student-teacher ratios, lack of infrastructure and other socio-economic issues have affected quality and accessibility of education significantly. Moreover, Education can also benefit from the potential and promises of technology (particularly AI), which has already transformed our lives in many aspects. An Intelligent Tutoring System (ITS) is a computer system which enables learning in… more

0 comments
Rejected
30 Apr 2016

Section: Full talk Technical level: Intermediate

Building a large scale fully automatic machine learning platform from scratch

Data science is hard, expensive and needs a combination of math, statistics and software engineering skills. Mass adoption of data science is only possible if self-service machine learning platforms are built. We have built Insight Jedi, the first fully automatic machine learning platform that automates the complete data-to-decisions workflow covering data cleanup, feature generation, feature fil… more

0 comments
Shortlisted
30 Apr 2016

Section: Full talk Technical level: Advanced

Stream in a Flink way

Apache Flink is a distributed stream and batch processing engine. It gives you high throughput and low latency. It supports event time, out-of-order events, streaming windows, exactly-once semantics, fault tolerance and many other cool features. It also has broad integration with many open source projects. more

1 comment
Shortlisted
30 Apr 2016

Section: Full talk Technical level: Intermediate

Reducing the world with JavaScript

The Earth is a staggering dataset. OpenStreetMap is the largest living open map of the world with a collection of over 1B mapped roads and ~2B mapped buildings. Processing this massive dataset can lead to a lot of interesting analyses about the world, but can also be really slow - enter the open source TileReduce module. more

1 comment
Confirmed & scheduled
30 Apr 2016

Section: Full talk Technical level: Intermediate

A large scale IOT platform architecture using open source apache projects like Nifi, Kafka, Storm, Spark and Hadoop.

Gartner predicts there will be 26 billion devices on the Internet of Things by 2020. Capturing and analyzing data from connected devices provides a wealth of opportunity. In this session we will look at how open source Apache projects like Apache NiFi, Kafka, Storm, Spark and Hadoop can work in concert to analyze and provide insights in a large scale distributed IoT architecture. more

2 comments
Shortlisted
30 Apr 2016

Section: Full talk Technical level: Intermediate

Predicting Corporate Bankruptcy by mining financial reports and regular transactional trends combining with Investor sentiment analysis

Bankruptcy is a major concern for any type of market. If a company fails and loses money, it damages a part of the economic environment. Prediction of bankruptcy has become important over time as it helps organizations, as well as the current government, mitigate risk. This short talk will walk you through how Machine Learning is changing the world of finance especial… more

7 comments
Rejected
30 Apr 2016

Section: Crisp talk Technical level: Intermediate

Sentiment analysis to evaluate the performance of Fund Managers

Global Assets under Management (AUM) are estimated at 64 Trillion USD. Investment Managers are the key players in this business who make investment decisions on behalf of investors. What are the tools the financial services companies have to evaluate the performance of these managers? There is a tremendous amount of data available for the underlying financial instrument, be it m… more

0 comments
Cancelled
30 Apr 2016

Section: Crisp talk Technical level: Beginner

Apache Drill - Optimising Time to market

Data is more than doubling every year. With semi-structured data growing at a much faster pace than structured data and data flowing from different sources having different data types, much of one’s time is wasted in defining schemas and transformations. Often, the schemas are unknown upfront, as datasets are evolving in highly dynamic ways. And current systems are unable to let us query dynam… more

0 comments
Waitlisted
30 Apr 2016

Section: Crisp talk Technical level: Intermediate

ML in fin-tech - Transforming 60 crore Indian lives

I lead Finomena, which uses the power of big-data, AI and ML in every imaginable way (information retrieval, NLP, deep learning, social network analysis, fraud detection and prevention, image recognition (even from videos), speech to text transcription and analysis, reinforcement learning) on a daily basis to provide access to credit to people in the long tail in India - over 60 crore people who … more

0 comments
Confirmed & scheduled
30 Apr 2016

Section: Full talk Technical level: Beginner

Data pipelines - Cakewalk with Docker and Luigi

Modern data-driven products are powered by pipelines of data processing tasks. Building this infrastructure requires a lot of boilerplate code. Moreover, deploying these tasks consistently across development and production environments, and maintaining resource isolation, can cause longer development cycles. Maintaining different versions of datasets and tracking improvement of your model on these ve… more

1 comment
Submitted
30 Apr 2016

Technical level: Advanced

Recommender Engines : A Peak into Predictive Analytics

The growth of data at exponential rates isn’t news today. Social media and e-commerce platforms are major contributing factors to this story. With billions of users online, the potential for marketing and reach is immense. Recommender engines are used across domains to help users make the right choices by understanding their behaviour and tastes. more

2 comments
Submitted
30 Apr 2016

Section: Full talk Technical level: Beginner

Challenges in Data Warehouse Augmentation on Hadoop

Enterprises these days are finding value in moving their traditional data warehouses into augmented and historical data stores on Hadoop. This requires continuous data synchronisation between traditional data warehouses and data on Hadoop. It is also an added advantage to maintain slowly changing dimensions of data when it is ingested onto Hadoop from traditional database systems. Once this data is av… more

0 comments
Rejected
01 May 2016

Section: Sponsored Technical level: Intermediate

Four horsemen of the IoT

MQTT brokers have been around for quite a bit. But never before has there been so much active development for IoT cloud providers. Silicon is cheaper than ever. IoT, especially industrial, is now feasible for even small and medium sized enterprises with lower margins. more

0 comments
Rejected
01 May 2016

Section: Full talk Technical level: Intermediate

An Approach for recommending TopK Digital Artworks

We have shown how recommender systems apply to the online digital artwork domain. The goal was to test the ability of recommender systems to aid artists in discovering artwork relevant to their likings. The users were from the online digital artwork sharing community, using the PENUP application. We have used information retrieval based metrics to measure the performance of a few key algorithms i… more

0 comments
Shortlisted
02 May 2016

Section: Crisp talk Technical level: Beginner

Anti-patterns in designing machine learning systems

The talk will focus on ML specific challenges to designing data science systems, how such systems acquire technical debt, and what we can do at design level to mitigate some of the risks. more

0 comments
Shortlisted
02 May 2016

Technical level: Advanced

Exploit conceptual data models using ontology modeling

We will introduce the audience to a different way of modeling data. And demonstrate creating an Ontology model using structured and unstructured content. more

0 comments
Submitted
02 May 2016

Section: Crisp talk Technical level: Beginner

Continuous online learning for classification tasks

At Airwoot (now acquired by Freshdesk), we model NLP-based margin-based classifiers to filter spam from relevant customer tweets/posts on social media. We work with the language of social, and this introduces a challenge of continuously adapting our models to the change in social verbiage. The language of social is dynamic with new hashtags, acronyms and induced spelling mistakes forcing us to upd… more

0 comments
Confirmed & scheduled
07 Jun 2016

Section: Full talk Technical level: Intermediate

Data Simulation as a means to intuitively grasp Statistics and it's direct application to prediction problems

Whenever there is data, there is meta-data about the data itself characterised in the form of Statistics. more

0 comments
Rejected
07 Jun 2016

Section: Full talk Technical level: Beginner

Introduction to Statistics and Basics of Mathematics for Data Science - the hacker's way

Many of us decided Math was our reckoning in high school and ended up studying highly quantitative fields like engineering and computer science, and some of us even specialized further with a Masters, including an MBA. And yet here we are, a few years into our careers, suddenly realizing that our math basics aren’t as strong as we thought they should have been. more

2 comments
Confirmed & scheduled
07 Jun 2016

Section: Workshop Technical level: Beginner

Leveraging Streaming Systems for Machine Learning

Larger datasets lead to better quality of Prediction models. However experimenting with larger datasets in a test environment to test the accuracy of the model is not always feasible, primarily due to limited resources like limited main memory, lack of CPU power, etc. This talk will highlight how such experiments can be conducted on small nodes (like a modern laptop) by leveraging streaming syste… more

0 comments
Cancelled
09 Jun 2016

Section: Crisp talk Technical level: Intermediate

RNNs for multimodal information fusion

Data generated from real world events are usually temporal and contain multimodal information such as audio, visual, depth, sensor etc. which are required to be intelligently combined for classification tasks. I will discuss a novel generalized deep neural network architecture where temporal streams from multiple modalities can be combined. The hybrid Recurrent Neural Network (RNN) exploits the c… more

0 comments
Cancelled
09 Jun 2016

Section: Crisp talk Technical level: Intermediate

Distributed Computing Abstractions for Big Data Science

The data science field has made significant advances in the last few years, with a renewed focus on getting data science to work at scale. The talk shall outline distributed computing abstractions required to realize data science at scale. The Resilient Distributed DataSet (RDD) abstraction provided by Spark is becoming a de-facto approach for big data science. However, Apache Flink and recently,… more

0 comments
Rejected
09 Jun 2016

Section: Full talk Technical level: Intermediate

Don’t just build a data lake, build data powerhouse.

Companies are now trying to become data oriented and to make decisions based on data. more

0 comments
Rejected
13 Jun 2016

Section: Full talk Technical level: Intermediate

Distributed change data capture platform

Today’s processing systems have moved from classical data-warehouse batch reporting to real-time processing and analytics. RDBMS (OLTP) data is one type of data required for analysis and deriving business insights. The traditional way of ingesting RDBMS data into an analytical system (Hadoop etc.) is via bulk import or query-based ingestion. This approach has the following issues: more

1 comment
Submitted
14 Jun 2016

Section: Full talk Technical level: Intermediate

Intuit’s Data journey to Public cloud

Cloud adoption has now entered the “early mainstream” stage as enterprises increasingly look to cloud deployment as a viable model for agile, cost-effective IT delivery. However, the prevailing binary paradigm of cloud infrastructure (public versus private) limits the extent to which enterprises can fully leverage the on-demand, self-service, elastic resource provisioning attributes of public clo… more

0 comments
Rejected
14 Jun 2016

Section: Crisp talk Technical level: Intermediate

How Intuit solved big scan problem in real time

Intuit provides business and financial management solutions for small and mid-sized businesses, financial institutions, consumers and accounting professionals. These products span several categories, including accounting, payroll, payments, tax. Since the business transactions involve Intuit and non-Intuit users of these products, we need a clear identity of the user/business across the offerings… more

1 comment
Waitlisted
14 Jun 2016

Section: Crisp talk Technical level: Beginner

Building a scalable Data Science Platform ( Luigi, Apache Spark, Pandas, Flask)

“In theory, there is no difference between theory and practice. But in practice, there is.” - Yogi Berra more

0 comments
Confirmed & scheduled
14 Jun 2016

Section: Workshop Technical level: Intermediate

Building a Large scale Augmented classifier ensemble to predict in noisy data

Different types of classifiers were investigated in the context of classification of problem tickets in the Enterprise domain. There were still challenges in building an accurate classifier after data cleaning and other accuracy-improving pre-processing techniques. Creating an ensemble of classifiers gave better accuracy than individual classifiers. The maximum accuracy was obtained by enhancing the en… more

0 comments
Rejected
15 Jun 2016

Section: Full talk Technical level: Advanced

RightFit- A Data Science Approach to Reduce Product Returns in Fashion e-Commerce

Fashion e-commerce industries experience a lot of product returns (or exchange) from customers. Most of these are attributed to incorrect size (or fitment). The talk will focus on this problem and present a solution to reduce such returns. Specifically, we present a data science driven approach to profile our customers based on their past purchases and returns and use that to recommend the right … more

2 comments
Confirmed & scheduled
15 Jun 2016

Section: Crisp talk Technical level: Intermediate

Bootstrapping inspired by Hacking Human Cognition

Several applications of Machine Learning are hamstrung by a vicious cycle. more

0 comments
Rejected
17 Jun 2016

Section: Crisp talk Technical level: Intermediate

Looking under the hood - demystifying data tools

The goal of this talk is to help build an understanding of the performance of the following packages: R Dataframe, R data.table, Pandas, Numpy, PySpark RDDs, PySpark Dataframes and RedShift. While these packages operate in different but intersecting realms of use cases, depending on the cardinality of the data and the operations that will be performed on it, some are more suited than others for the… more

2 comments
Confirmed & scheduled
17 Jun 2016

Section: Crisp talk Technical level: Intermediate

Deep Learning for Computer Vision

One of the fields that have benefited the most from the rise of Deep Learning has been Computer Vision. The goal of this workshop is to have participants go from the basics to tackling a real-world problem. more

0 comments
Confirmed & scheduled
23 Jun 2016

Section: Workshop Technical level: Intermediate

Scalable Realtime Analytics using Druid

Traditional SaaS solutions based on Hadoop datastores (Hive/HBase) or classical RDBMS work well for storing data, but they are not optimized for ingesting data and making it immediately available for interactive, ad-hoc, low-latency queries at very high scale. Long query latencies make these solutions suboptimal choices to power interactive applications. This talk will introduce Druid as a comp… more

2 comments
Confirmed & scheduled
06 Jul 2016

Section: Full talk Technical level: Intermediate

Advanced Deep Learning Workshop – Hands-on

Deep Learning is a hot topic, but has a steep initial learning curve. This workshop is aimed at giving participants ‘hands-on’ experience of a range of deep learning techniques. more

0 comments
Confirmed & scheduled
07 Jul 2016

Section: Workshop Technical level: Advanced

Convolutional Neural Networks from the Other Side

Deep Learning has made a lot of progress in the last four years: more

0 comments
Confirmed & scheduled
09 Jul 2016

Section: Full talk Technical level: Advanced

The Alternative Data revolution on Wall St

This talk will focus on the role that non-traditional data research, known as alternative data, is beginning to play across the investment community. We will address how datasets such as point of sale transactions, web site usage, municipality records, social media data and similar information are being utilized by traditional long-short funds, quantitative hedge funds and also mutual funds. more

0 comments
Confirmed & scheduled
11 Jul 2016

Section: Full talk Technical level: Intermediate

Taking Analytics Applications from Labs to the Real World: Transfer Learning in Practice

The performance of traditional supervised learning models degrades if the “nature” of the test samples differs from that of the training samples. For example, when a classifier built to discriminate between “books” with positive, negative and neutral reviews is applied to discriminate between “kitchen products” into the same set of categories, its performance drops. This relates to one of the fundamental probably approxi… more

0 comments
Confirmed & scheduled
11 Jul 2016

Section: Full talk Technical level: Intermediate

Machine Learning the Walmart Way with a Deep Dive into Automated Forecasting System

Walmart, the largest retailer, also has one of the largest data sets, with petabytes of data created every day. The world is moving to a more data-driven decision-making ecosystem and building machines that can take those decisions. Hence effective management of the data and analysis in a human-independent manner is the need of the hour. more

1 comment
Confirmed & scheduled
11 Jul 2016

Section: Crisp talk Technical level: Intermediate

Lessons Learned : Real-life NLP

Building a practical Natural Language Processing system goes far beyond installing an open source toolkit. I will give an overview of some of the components required, and obstacles that have to be overcome for a system that extracts entities and relationships from full-text documents. more

0 comments
Confirmed & scheduled
12 Jul 2016

Section: Crisp talk Technical level: Intermediate

Meet the needs of content marketing with the power of NLP

Content Marketing is one of the recent buzzwords in the space of digital marketing. Content Marketing broadly refers to providing quality, useful content to engage customers and attract them towards a brand. With the proliferation of channels where this content can potentially be delivered, there is an increasing demand from content writers to provide content that can be … more

0 comments
Confirmed & scheduled
13 Jul 2016

Section: Full talk Technical level: Intermediate

Hadoop & Cloud Storage: Object Store Integration in Production

Today’s typical Apache Hadoop deployments use HDFS for persistent, fault-tolerant storage of big data files. However, recent emerging architectural patterns increasingly rely on cloud object storage such as S3, Azure Blob Store, GCS, which are designed for cost-efficiency, scalability and geographic distribution. Hadoop supports pluggable file system implementations to enable integration with the… more

0 comments
Confirmed & scheduled
15 Jul 2016

Section: Crisp talk Technical level: Intermediate

Deciphering Driving Behaviour using Geospatial Temporal Data Collected from Smartphone Sensors

Our vision at Zendrive Technologies is ‘Safer Drivers, Safer Roads’. To that end, we collect data from a variety of sensors available on smartphones and, combining techniques from signal processing, statistical modeling and geographical information systems (GIS), we detect events pertaining to driving and characterize one’s driving style. more

0 comments
Confirmed & scheduled
18 Jul 2016

Section: Full talk Technical level: Intermediate

Hierarchical Bayes Approach and Implementation of MCMC in an Ecological Study

The Bayesian paradigm for analysing data gained unmatched popularity in most fields of statistical application in the late twentieth century. Bayesian methods permit one to construct statistical models by simultaneously using the current data and all the prior information on hand to make inference about the unknown nature of the underlying process, in a marvellously simple way. But th… more

0 comments
Confirmed & scheduled
18 Jul 2016

Section: Full talk Technical level: Advanced

Real Time Fulfilment Planning at Flipkart Scale

Flipkart.com stores and sells millions of unique items through its Fulfillment Centers (FCs) and Sellers. These items need to be picked from FCs or need to be shipped from tens of thousands of Sellers into the many Sortation Centres in the Flipkart network. We need different quantities of each of these items, we need to pick them up from the FCs or Sellers at different times, and bring it into th… more

0 comments
Confirmed & scheduled
19 Jul 2016

Section: Full talk Technical level: Intermediate

Allocation and Forecasting in Guaranteed Delivery of Advertisements

Guaranteed delivery (GD) of advertisements helps brands book advertisement views of niche audience segments well in advance. To enable this, we need to create an intelligent system which allows for targeting of users, forecasting supply, optimally booking campaigns, allocating campaigns to users, pricing the guarantees and penalties correctly. more

0 comments
Confirmed & scheduled
19 Jul 2016

Section: Full talk Technical level: Intermediate

Scaling the Largest Functional DataSet @Flipkart aka Catalog

Catalog refers to the product pivoted information. This Functional data can often be non-trivial to manage and serve, especially when it is constantly evolving. Managing the flux of incoming updates, keeping timestamp consistent data views to entities & their associations and serving it to clients are the main challenges. This talk tries to take us through the journey of scaling platform to serve… more

0 comments
Confirmed & scheduled
19 Jul 2016

Section: Full talk Technical level: Intermediate

Reasoning: The Next Frontier in Data Science

The “Prediction Paradigm” in data science has come a long way. Today, we can build reasonably accurate models for complex prediction problems such as detecting objects in images, answering Jeopardy questions, translating documents from one language to another, or recognising people from face images. more

1 comment
Confirmed & scheduled
21 Jul 2016

Section: Full talk Technical level: Intermediate

Using Data to Identify the Genomic Cause of Disease

A number of diseases, including cancer, are caused by genomic mutations. The task of identifying the causative mutation requires sequencing the genome and then analysing the large amount of data that results. What follows can often be confounding in various ways as this talk will illustrate with real examples -- infants who pass away mysteriously, siblings with misplaced organs, a little boy suff… more

0 comments
Confirmed & scheduled
21 Jul 2016

Section: Full talk Technical level: Intermediate

Hosted by

The Fifth Elephant