Submissions

The Fifth Elephant 2015

A conference on data, machine learning, and distributed and parallel computing

Machine Learning, Distributed and Parallel Computing, and High-performance Computing are the themes for this year’s edition of Fifth Elephant.

The deadline for submitting a proposal is 15th June 2015.

We are looking for talks and workshops from academics and practitioners who are in the business of making sense of data, big and small.

Track 1: Discovering Insights and Driving Decisions

This track is about general, novel, fundamental, and advanced techniques for making sense of data and driving decisions from data. This could encompass applications of the following ML paradigms:

  • Statistical Visualizations
  • Unsupervised Learning
  • Supervised Learning
  • Semi-Supervised Learning
  • Active Learning
  • Reinforcement Learning
  • Monte Carlo techniques and probabilistic programming
  • Deep Learning

These may span various data modalities, including multivariate, text, speech, time series, images, video, transactions, etc.

Track 2: Speed at Scale

This track is about tools and processes for collecting, indexing, and processing vast amounts of data. The theme includes:

  • Distributed and Parallel Computing
  • Real Time Analytics and Stream Processing
  • MapReduce and Graph Computing frameworks
  • Kafka, Spark, Hadoop, MPI
  • Stories of parallelizing sequential programs
  • Cost/Security/Disaster Management of Data

Commitment to Open Source

HasGeek believes in open source as the binding force of our community. If you are describing a codebase for developers to work with, we’d like it to be available under a permissive open source license. If your software is commercially licensed or available under a combination of commercial and restrictive open source licenses (such as the various forms of the GPL), please consider picking up a sponsorship. We recognize that there are valid reasons for commercial licensing, but ask that you support us in return for giving you an audience. Your session will be marked on the schedule as a sponsored session.

Workshops

If you are interested in conducting a hands-on session on any of the topics falling under the themes of the two tracks described above, please submit a proposal under the workshops section. We also need you to tell us about your past experience in teaching and/or conducting workshops.

Hosted by

The Fifth Elephant, known as one of the best data science and machine learning conferences in Asia, has transitioned into a year-round forum for conversations about data and ML engineering, data science in production, and data security and privacy practices.

Rudraksh MK

Tackling ML's black boxes with probabilistic programming

While machine learning has become a wildly popular solution for analyzing a lot of problems, it’s also ended up becoming a major black box. The objective of this talk is to showcase probabilistic programming as a feasible alternative in such scenarios.
  • 0 comments
  • Rejected
  • 18 Apr 2015
Section: Full Talk Technical level: Advanced

Dr. Jai Ganesh

Networks and Network Analysis

This talk will cover various issues related to Networks and the ways to leverage Social Network Analysis techniques to gather inferences and insights from them.
  • 0 comments
  • Rejected
  • 27 Apr 2015
Section: Full Talk Technical level: Advanced

Madhukara Phatak

Big data analysis with Apache Spark

Apache Spark is an up-and-coming big data processing engine. It’s getting popular for its ease of use and its unification of different big data workloads. The objective of this workshop is to get your hands dirty with it.
  • 0 comments
  • Rejected
  • 06 May 2015
Section: Workshop Technical level: Beginner

Madhukara Phatak

Anatomy of RDD : A Deep dive into Spark RDD Data structure.

RDD is the core abstraction of Apache Spark, so understanding RDD in depth is crucial to using Spark effectively. This talk aims to take the audience on a deep dive into RDD to help them understand why it’s so powerful.
  • 1 comment
  • Rejected
  • 06 May 2015
Section: Full Talk Technical level: Advanced

Rahul Kavale

Scrap Your MapReduce - Introduction to Apache Spark

An introduction to Apache Spark: compare and contrast it with the MapReduce programming model, see what Apache Spark has to offer and where it shines, and learn how to use it via real-life examples.
  • 0 comments
  • Rejected
  • 07 May 2015
Section: Full Talk Technical level: Beginner

Rahul Kavale

Deprecating MapReduce Patterns with Apache Spark

A live coding demonstration to show how Apache Spark can solve non-trivial problems, thereby deprecating some of the established patterns of MapReduce, with concise code, significant performance gains, and a developer-friendly programming model, while keeping the other sweetness of the MapReduce world intact.
  • 2 comments
  • Rejected
  • 07 May 2015
Section: Full Talk Technical level: Intermediate

Bhasker Kode

Instrumenting your kafka & storm pipeline

Tips for designing your stream processing setup: what can go wrong and how to instrument it.
  • 0 comments
  • Confirmed & scheduled
  • 11 May 2015
Section: Full Talk Technical level: Intermediate

Steven Deobald

Two Years Wiser: The Nilenso Experiment

Attendees will hear how nilenso has overcome a series of challenges present in running a technology co-operative. This story will be informative for anyone who wants their team to be more involved, not just for employee-owned companies. Understanding decision-making, execution, and delivery is essential for any business. By describing the structural and procedural challenges we’ve faced over the …
  • 0 comments
  • Confirmed & scheduled
  • 11 May 2015
Section: Full Talk Technical level: Beginner

Ramesh Sampath

Building Data Products for Small / Mid-Sized Data

Understand the process Kevin Gates and I went through in building www.seeingtheair.com, a hackathon data product to compare air quality in various cities. The audience will gain an appreciation for the data extraction and exploration phases, along with building a web app, and some intuition for data visualization. I intend to show the Python code behind the app in this talk.
  • 0 comments
  • Cancelled
  • 12 May 2015
Section: Full Talk Technical level: Intermediate

Bargava Subramanian

Introduction to Deep Learning

In fields like computer vision, speech recognition and natural language processing, deep learning has produced state-of-the-art results. It is showing a lot of promise in other fields too.
  • 0 comments
  • Confirmed & scheduled
  • 19 May 2015
Section: Workshop Technical level: Intermediate

Amit Kapoor

Visualising Multi Dimensional Data

To understand techniques to effectively visualise multi dimensional data to aid exploratory data analysis.
  • 0 comments
  • Confirmed & scheduled
  • 19 May 2015
Section: Full Talk Technical level: Intermediate

Swaroop Krothapalli

Building Recommender system

Will talk about classical and state-of-the-art recommender systems. The audience will also get a flavour of the mathematical computations that go into recommender systems.
  • 0 comments
  • Confirmed & scheduled
  • 21 May 2015
Section: Crisp Talk Technical level: Beginner

Bargava Subramanian

On building a cloud-based black-box predictive modeling system

Data Analytics platforms, with predictive models at their core, are the buzzword in Enterprise Analytics. Having been on both sides - a consultant providing analytics and a consumer of analytics, I’ve realized that there are few, if any, runaway winners. Rightly so. It is one of the hottest growth areas. This talk would go over some of the ingredients to building a successful data analytics platf…
  • 0 comments
  • Rejected
  • 21 May 2015
Section: Full Talk Technical level: Beginner

Venkata Naga Ravi

Big Data Benchmarking

Participants will learn benchmarking techniques for big data.
  • 0 comments
  • Waitlisted
  • 23 May 2015
Section: Full Talk Technical level: Intermediate

Venkata Naga Ravi

Processing large data with Apache Spark

An overview of Apache Spark’s functionality, with a detailed look at its architecture. We will touch upon Spark Streaming’s capability for near real-time processing.
  • 0 comments
  • Confirmed & scheduled
  • 23 May 2015
Section: Full Talk Technical level: Intermediate

Harshad Saykhedkar

Understanding supervised machine learning hands on!

If you have ever been in a “black box” operating mode where you are throwing more data/complex models at a machine learning problem without a clue about why it is working or not working, this workshop is for you! The workshop will primarily focus on understanding supervised machine learning.
  • 1 comment
  • Confirmed & scheduled
  • 25 May 2015
Section: Workshop Technical level: Beginner

Rajat

Building Spark as Service in Cloud using YARN

Apache Spark is rapidly taking off in popularity as a new data processing framework. However, it can be daunting to install and run. In this talk we will talk about the challenges of running Spark in the cloud using YARN and how we have built Spark as a Service. We will also discuss our learnings from building and operating this service in the AWS cloud, and future directions.
  • 0 comments
  • Rejected
  • 25 May 2015
Section: Full Talk Technical level: Intermediate

Manoj Sundaram

Securing your Enterprise Hadoop Cluster

Hadoop was originally developed for crawling the Internet and indexing - where security is not a concern. But we have come a long way since then. Major banks and organizations are adopting Hadoop as their preferred Big Data platform and there is a growing emphasis on securing the Data and the Cluster components/resources. In a complicated, distributed system like Hadoop, there are several attack …
  • 0 comments
  • Waitlisted
  • 27 May 2015
Section: Full Talk Technical level: Intermediate

Yagnik

Critical pipe fittings: What every data pipeline requires

The talk aims to give data builders key pointers that will help them build their own frameworks and tools to add transparency to their data pipeline and ship faster.
  • 0 comments
  • Confirmed & scheduled
  • 27 May 2015
Section: Full Talk Technical level: Intermediate

Muktabh Mayank

Making a contextual recommendation engine using Python and Deep Learning at ParallelDots

ParallelDots ( paralleldots.com ) is a recommendation engine for publishers to increase engagement/monetization on their websites. For the end user, it solves the problem of information overload by providing a set of relevant stories and history about whatever he/she is reading. ParallelDots provides a set of recommendation engines which include the most accurate related posts widget, automated tim…
  • 0 comments
  • Confirmed & scheduled
  • 27 May 2015
Section: Crisp Talk Technical level: Beginner

Vaidhy Gopalan

A review of important results in distributed systems

The key objective is to give the attendee a more nuanced appreciation of the constraints involved in designing distributed/fault-tolerant systems. At the end of this talk, the attendee should be conversant with some of the key theorems, ideas and common solutions to distributed problems.
  • 1 comment
  • Confirmed & scheduled
  • 28 May 2015
Section: Full Talk Technical level: Intermediate

Amit Jain

Leveraging Cloud for BigData Analytics - Patterns, Options and Practical Next Steps

This talk will cover in depth how to leverage public clouds for big data analytics. It will also describe the next steps to get you started on your cloud-based big data analytics initiative.
  • 0 comments
  • Rejected
  • 28 May 2015
Section: Full Talk Technical level: Intermediate

sudipta mukherjee

Squirrel – Enabling Accessible Analytics for All

Simplify and widen the scope of the Software Developer to create smart tools that enable easy access and actionable insights for all.
  • 0 comments
  • Cancelled
  • 31 May 2015
Section: Crisp Talk Technical level: Intermediate

Kiran Veigas

Anomaly Detection Using Apache Spark

Walk through how we used Spark’s scalable KMeans algorithm to detect anomalies for our Cyber Analytics platform.
  • 1 comment
  • Rejected
  • 01 Jun 2015
Section: Crisp Talk Technical level: Advanced

Ravishankar Rajagopalan

High Performance Computing in R

This is a hands-on workshop focused on the high-performance aspects of R programming. Attendees will learn how to identify performance issues and address them using various R packages. This workshop is targeted at an audience with basic familiarity with R.
  • 0 comments
  • Rejected
  • 03 Jun 2015
Section: Workshop Technical level: Intermediate

Satnam Singh, PhD

HawkEye: A Real-Time Anomaly Detection System

In this talk, I will present the details of the HawkEye system with insights on selection of algorithms and parameter tuning. I intend to share our mistakes and learnings while developing HawkEye.
  • 2 comments
  • Confirmed & scheduled
  • 08 Jun 2015
Section: Crisp Talk Technical level: Beginner

Vishnuteja Nanduri

IT Operations Analytics: Using Text Analytics and Statistical Modeling in IT Operations Data

Attendees will be exposed to the emerging area of IT Operations Analytics. Attendees will learn how text mining and statistical modeling techniques can be used to extract insights out of IT Operations Data.
  • 1 comment
  • Waitlisted
  • 08 Jun 2015
Section: Full Talk Technical level: Intermediate

Regunath Balasubramanian

Building tiered data stores using Aesop to bridge SQL and NoSQL systems

Understand how to build and use tiered data stores with Aesop using best-in-class SQL and NoSQL systems. Also relate to a number of real world requirements where this technology and patterns can be applied, while scaling to millions of data records.
  • 0 comments
  • Confirmed & scheduled
  • 10 Jun 2015
Section: Full Talk Technical level: Intermediate

Anup Nair

Search at Petabyte scale

Learnings from how we scaled and ran our search infrastructure in a SaaS world, which crunches 25+ PB of data every day.
  • 0 comments
  • Confirmed & scheduled
  • 11 Jun 2015
Section: Crisp Talk Technical level: Intermediate

Deepak Krishnan

Running natural language queries against NoSQL schema

Demonstrate and discuss advanced text parsing and processing techniques on unstructured data.
  • 4 comments
  • Confirmed & scheduled
  • 11 Jun 2015
Section: Crisp Talk Technical level: Advanced

Gagan Agrawal

Recommendation System beyond traditional Collaborative filtering

I will share my thoughts and experiences at Snapdeal in building a more personalized and relevant recommendation system for the e-commerce industry, covering the mathematical, technological, machine learning and other aspects related to it.
  • 0 comments
  • Confirmed & scheduled
  • 12 Jun 2015
Section: Full Talk Technical level: Intermediate

Shashi Gowda

Escher - democratizing beautiful visualizations

I aim to introduce the audience to Escher.jl - a new tool for web-based interactive visualizations wholly programmable in a single, data-friendly, fast language: Julia. Hopefully, the pleasant ergonomics of the library will encourage data scientists to create more explorable, beautiful, and insightful presentations of data, and also create user interfaces without an army of front-end developers.
  • 1 comment
  • Confirmed & scheduled
  • 12 Jun 2015
Section: Crisp Talk Technical level: Beginner

Gagan Agrawal

Building Complex Data Workflows with Cascading on Hadoop

Understand how to build complex data workflow pipelines with Cascading on Hadoop by taking inputs from different sources and pushing crunched data to different sinks.
  • 0 comments
  • Shortlisted
  • 13 Jun 2015
Section: Full Talk Technical level: Intermediate

Gagan Agrawal

Aerospike : High Performance NoSQL store with flash optimization

High-performance databases are a need of most widely used real-time internet services. Low latency and high throughput have always been of utmost importance in bringing traffic to the site. Aerospike is one such NoSQL store designed to maintain under 1 millisecond response time even under peak load with billions of records spanning terabytes in size. Optimized for flash storage, Aerospike can …
  • 0 comments
  • Rejected
  • 13 Jun 2015
Section: Full Talk Technical level: Intermediate

Swaroop Krothapalli

Ensemble Learning

To understand the most basic and convenient approaches to ensembling.
  • 0 comments
  • Rejected
  • 14 Jun 2015
Section: Full Talk Technical level: Beginner

Srinivasa Rao Aravilli

Benchmarks from JVM to Big Data

Explain various benchmarks related to JVM and Big Data.
  • 0 comments
  • Rejected
  • 14 Jun 2015
Section: Full Talk Technical level: Intermediate

Kaushik Paranjape

Big Data Engineering made easy

Switching the database for scaling up and then porting all the algorithms / reporting functionalities that had been implemented to the new database is a challenge. At Sokrati we have eased this pain by implementing proprietary APIs (for internal use).
  • 4 comments
  • Rejected
  • 14 Jun 2015
Section: Full Talk Technical level: Intermediate

Viral B. Shah

The many ways of parallel computing with Julia

Introduce Julia for those who haven’t heard about it, and focus on parallel computing with Julia. I will try to do some fun stuff with 1000 processors in a demo.
  • 0 comments
  • Cancelled
  • 14 Jun 2015
Section: Full Talk Technical level: Beginner

Viral B. Shah

The many ways of parallel computing with Julia

Introduce Julia for those who haven’t heard about it, and focus on parallel computing with Julia. Do some demos with hundreds of processors. The audience will get a feel for parallel computing with Julia and is strictly advised to “Try it at home.”
  • 1 comment
  • Confirmed & scheduled
  • 14 Jun 2015
Section: Full Talk Technical level: Beginner

Vishal

Deconstructing Linear Regression

This short talk aims to “deconstruct” Linear Regression and explain the steps done by the library functions before throwing out the intercept and slope.
  • 1 comment
  • Rejected
  • 14 Jun 2015
Section: Crisp Talk Technical level: Beginner

Bhasker Kode

POC: How to slice, dice & search billions of users events in seconds (from scratch)

Results from a proof-of-concept business intelligence tool in which each bit in a multi-billion-bit bitmap represents a user performing an event. A minimal 100 LOC implementation gave encouraging results and also showed areas that could improve. Covers caveats and ideas for rolling out your own BI tool.
  • 3 comments
  • Confirmed & scheduled
  • 14 Jun 2015
Section: Crisp Talk Technical level: Beginner

Siddhartha Reddy

CAP Theorem: You don’t need CP, you don’t want AP, and you can’t have CA

CAP Theorem is everywhere: “Consistency, Availability, Partition tolerance — choose any two!” But it is oversimplified and misunderstood more often than not. CAP’s consistency isn’t what most people think it is; CAP’s availability isn’t what most people think it is; what does partition-tolerance even mean?
  • 13 comments
  • Confirmed & scheduled
  • 14 Jun 2015
Section: Full Talk Technical level: Intermediate

Amit Kapoor

Static & Interactive Exploratory Data Analysis in R

Learn to quickly do static and interactive visual exploration of large datasets in R.
  • 0 comments
  • Waitlisted
  • 15 Jun 2015
Section: Workshop Technical level: Intermediate

Himadri Sarkar

Approximate algorithms for summarizing streaming data

Introduce two approximate algorithms which are considered cornerstones of big data infrastructure.
  • 5 comments
  • Confirmed & scheduled
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Rajesh Balamohan

Apache Tez - Present and Future

Talk about the present and future of Apache Tez.
  • 0 comments
  • Confirmed & scheduled
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Anand S

Automating news discovery in real-time

The breaking news segment is an intensely competitive market with players from the TV, radio, online, mobile and print space competing for attention. The ability to discover trends early and “break” them is an edge.
  • 0 comments
  • Cancelled
  • 15 Jun 2015
Section: Full Talk Technical level: Beginner

Rohit Chatterjee

Using Modes for Time Series Classification

To present methods for time series analysis other than ARIMA etc.
  • 0 comments
  • Confirmed & scheduled
  • 15 Jun 2015
Section: Crisp Talk Technical level: Beginner

Sudhir Rawat

Getting Started with IoT

How an IoT solution can be delivered with ease; the different options available for building an IoT solution; an understanding of solution architecture.
  • 0 comments
  • Rejected
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Saurabh Banerjee

Anatomy of Decision Trees using an example from Kaggle

Decision trees are amongst the most popular predictive modelling techniques in the analytics industry. Attendees will learn how to effectively apply decision trees to predict survival on the Titanic: Machine Learning from Disaster problem in Kaggle.
  • 5 comments
  • Rejected
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Sudhir Rawat

Building Real time solution within 30 minutes

Understanding the features available to build the solution within 30 minutes. Ease of the technology. How to build a real-time solution in less time even if you are not a hard-core developer.
  • 0 comments
  • Rejected
  • 15 Jun 2015
Section: Crisp Talk Technical level: Beginner

Nikhil Ketkar

Are these the same pair of shoes? - Matching retail products at scale

Matching identical products from different retail websites is one of the hardest and the most impactful problems in the space of product intelligence. This talk will cover the breadth of algorithms and models we use for matching products across customer catalogs. It will also cover some practical aspects of taking these algorithms and models to production.
  • 1 comment
  • Confirmed & scheduled
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Dhanesh Padmanabhan

An Integrated Weblog Processing and Machine Learning Workflow for Building and Deploying Intent Prediction Models at Scale

To share with the audience our experiences in setting up a scalable infrastructure for weblog processing and machine learning leveraging several technologies such as Hadoop, Vertica, R and Python. The talk will focus on implementing scalable data models for dynamic intent predictions on web/mobile channels and machine learning best practices.
  • 0 comments
  • Rejected
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Kausik Ghatak

Practical Approach to Python based Supervised Machine Learning: User Generated Text Classification Techniques

In e-commerce, we handle a large volume of user-generated content in the form of reviews, ratings, questions and answers, chat, etc. This user-generated content has a lot of value for making the right organization-wide business decisions. This large volume of user-generated text also poses problems of classification and moderation because the data is mostly unstructured. Combination of various Machi…
  • 0 comments
  • Rejected
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Vinodh Kumar R

Building a E-commerce search engine: Challenges, insights and approaches

The objective of the talk is to motivate the problems and challenges of e-commerce search and provide insights and approaches on how one can go about building a world-class product search engine.
  • 0 comments
  • Confirmed & scheduled
  • 15 Jun 2015
Section: Sponsored Technical level: Beginner

Srihari Sriraman

postgres clusters and their nuances

We built a postgres cluster using repmgr to serve 2k requests per second, and store 5G of data per day. You’ll learn about postgres’ WAL replication and archival, how repmgr works, how we leveraged it for our needs, hooked it up to our application, and built multiple lines of defence in case something bad happens. And oh, we’ll also compare it with RDS for good measure.
  • 2 comments
  • Waitlisted
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Raghu Kashyap

Revolutionizing travel with ML & Analytics – An insight into business optimization using Machine Learning and Advanced Analytics

At Orbitz, Big Data technologies have helped transform the way we let people travel. In this talk we elaborate on how we at Orbitz have leveraged intelligence derived from more than 2 PB of semi-structured and unstructured data to optimize various facets of our business such as content optimization, search personalization and channel optimization.
  • 1 comment
  • Confirmed & scheduled
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Reetinder Sidhu

Hardware Accelerated Big Data Processing

Attendees can expect to gain a clear understanding of FPGAs (Field Programmable Gate Arrays) and their pros and cons compared with software on microprocessors for big data.
  • 0 comments
  • Confirmed & scheduled
  • 15 Jun 2015
Section: Crisp Talk Technical level: Intermediate

Suchitra Amalapurapu

Solr compute cloud - An elastic Solr infrastructure

Go over various challenges in scaling the Solr search platform to serve hundreds of millions of documents with low latencies and high throughput in a multi-tenant architecture.
  • 0 comments
  • Rejected
  • 15 Jun 2015
Section: Full Talk Technical level: Advanced

Aniruddha Gangopadhyay

Joining data streams at scale for fun and profit

Understand how to derive more value out of real-time data streams by joining them using a stream processing system to derive deeper insights. We’ll walk through our experience of building a platform for such use-cases at Flipkart, and describe the design patterns we have evolved within it; we have scaled this platform to process billions of events a day across hundreds of streaming data applicati…
  • 0 comments
  • Confirmed & scheduled
  • 15 Jun 2015
Section: Crisp Talk Technical level: Beginner

Paul Meinshausen

Developing a Hybrid Recommender System for Some of Life’s Most Important Choices

Recommender Systems are both an old and an active area of research. Advances in Recommender Systems can emerge from developing applications in new contexts and for new use cases. In this session we will describe the unique challenges associated with building a recommender system for real estate and we will present the work we are doing to develop a hybrid recommender system for real estate at Hou…
  • 0 comments
  • Rejected
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Gagandeep singh

What does your website look like to a web-crawler

The objective of this talk is to go deeper into what site structure is, potential problems with respect to discoverability of your website from the perspective of a web crawler and how to go about fixing it.
  • 0 comments
  • Shortlisted
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Aditya Prasad Narisetty

Data Infrastructure for Real Time Analysis of User Click Stream Data

India is churning out a large number of service oriented startups by the day. They need to build customized views for users based on those users’ previous sessions and interactions with the product. Most startups can’t afford to design, build and maintain a custom Data Analytics Pipeline let alone do real-time data analysis and refine user interactions with the product. Most startups have few dev…
  • 0 comments
  • Submitted
  • 15 Jun 2015
Section: Full Talk Technical level: Beginner

Ronak

Designing distributed components in a multi tenant architecture

The objective of this talk is to go over the design of distributed search components in a multi-tenant architecture spanning geographies, and to deal with challenges around custom ranking, tenant-specific configurations and dynamic ranking elements.
  • 0 comments
  • Rejected
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Devashish Shankar

Deep Learning for Natural Language Processing

This talk is about how we applied deep learning techniques to achieve state-of-the-art results in various NLP tasks like sentiment analysis and aspect identification, and how we deployed these models at Flipkart.
  • 1 comment
  • Confirmed & scheduled
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Niranjan Bala V

Map Tile Server

Problem statement: Our previous maps at CommonFloor used JavaScript to show listings on Google Maps. The user experience would break under high density, zoom/pan, and application of filters. To solve the problem, we have overlaid our own tiles (transparent PNG images of size 256 x 256) on Google Maps. These tiles are generated from the backend and improve the maps experience significantly by making the whole …
  • 0 comments
  • Submitted
  • 15 Jun 2015
Section: Crisp Talk Technical level: Intermediate

Siddhartha Reddy

Stream Processing in production: Metrics that matter

Understand some useful metrics for monitoring the health of stream processing jobs (such as Apache Storm topologies) when they are deployed in production. Also get some ideas on how to capture these metrics (including suggestions for libraries & tools), and how to proactively keep problems from escalating.
  • 0 comments
  • Rejected
  • 15 Jun 2015
Section: Crisp Talk Technical level: Intermediate

Anand Chandrasekaran

Keeping Moore's law alive: Neuromorphic computing

This talk explores the implications of Neuromorphic Engineering, or ‘building brains in silicon’, on the development of extremely parallel compute techniques such as deep learning.
  • 0 comments
  • Confirmed & scheduled
  • 15 Jun 2015
Section: Full Talk Technical level: Beginner

Pranav Agarwal

Exploratory data analysis using Apache Lens and Apache Zeppelin

Apache Lens is an analytics platform that aims to cut the Data Analytics silos by providing a single view of data across multiple tiered data stores and an optimal execution environment for the analytical query. It seamlessly integrates Hadoop with traditional data warehouses to appear like one. Zeppelin is a web-based notebook that enables interactive data analytics.
  • 1 comment
  • Confirmed & scheduled
  • 15 Jun 2015
Section: Crisp Talk Technical level: Intermediate

chinmayi sk

Holistic Security Process for Humanitarian Projects

Humanitarian projects usually contain sensitive information and are more prone to risks. Hence it is important to include security holistically in project planning. The objective of the session would be to present good data security practices to be followed while working on a humanitarian project.
  • 1 comment
  • Rejected
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Abhijit Pratap Singh

Harnessing the power of the Erlang VM at Housing

RoR and Django have ensured we remain productive in the face of rapidly changing product requirements at Housing. However, we ran into issues of memory and speed when we had to scale throughput and interface with other services in our SOA. This talk describes how we rewrote some core parts of our infrastructure to ride on the coattails of the awesome Erlang VM.
  • 0 comments
  • Confirmed & scheduled
  • 15 Jun 2015
Section: Crisp Talk Technical level: Intermediate

Sumod Mohan

Graph Algorithms and Computer Vision

Discover some of the interesting connections between various sub-areas of Machine Learning, Analytics and Computer Vision. Specifically how Random Walk on a Graph can give clustering of data, how clustering can help in Segmentation (image/video) of data and how many of these can boil down to eigen decomposition of a specially crafted matrix of graph data.
  • 0 comments
  • Confirmed & scheduled
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Vivek Mehta

How to stop admiring and start using Deep Learning

Deep Learning results look fascinating, but it seems to require huge infrastructure to get started. In this talk, we present how to approach it in an incremental manner to make real use of Deep Learning.
  • 0 comments
  • Rejected
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Jasvinder Singh

Scalable real-time personalized recommendation system

This talk goes over some challenges in scaling a real-time personalized recommendation system that can dynamically adapt to user actions and incorporate these signals into various applications like search, recommendations, predictive suggestions, etc.
  • 0 comments
  • Submitted
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

ravi teja

Think Incremental with Hive

Hive is a data warehouse infrastructure over Hadoop for summarization, query, and analysis of data. We propose an incremental processing approach for Hive. This would improve data processing speeds by up to ~70%.
  • 0 comments
  • Shortlisted
  • 15 Jun 2015
Section: Crisp Talk Technical level: Intermediate Session type: Demo

Shubham Bansal

High Performance Tiled Map Service

This session introduces the service behind the Housing map interface and discusses the technical challenges and performance bottlenecks faced while developing the tiled map service.
  • 0 comments
  • Shortlisted
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Mudit Gupta

From Search to Discovery at Housing

The objective of this session is to introduce a framework and models for search recommendations through real-time user click stream analysis. We will be talking about various architectural challenges and challenges in modeling the expert system and how it can be used in different domains.
  • 0 comments
  • Waitlisted
  • 15 Jun 2015
Section: Full Talk Technical level: Beginner

Shalin Mangar

Call me maybe: Jepsen and flaky networks

Tell people that network partitions happen often enough that it is worth caring about how their distributed data stores respond in such situations.
  • 0 comments
  • Confirmed & scheduled
  • 15 Jun 2015
Section: Full Talk Technical level: Advanced

Vedang Manerikar

Dead Simple Scalability Patterns

Everyone dreams of being ‘Web Scale’, but we start out small. Most of us don’t launch a service and expect it to serve millions of requests from Day 1. This means that we don’t think about the ways in which our stack will blow up when the number of requests does start climbing. This talk lists simple patterns and checks that Development and Operations teams should implement from Day 1 in o…
  • 0 comments
  • Confirmed & scheduled
  • 15 Jun 2015
Section: Crisp Talk Technical level: Beginner

Kapil Reddy

Building a distributed cache system with Redis, Clojure and math

Learn how consistent hashing, CRDTs and Clojure protocols can be used to build a distributed cache.
  • 0 comments
  • Rejected
  • 15 Jun 2015
Section: Full Talk Technical level: Intermediate

Renuka Khandelwal

AB testing: What, Why & How

Understand what AB testing is, why it is a great tool, and how to experiment correctly, along with learnings.
  • 0 comments
  • Rejected
  • 16 Jun 2015
Section: Full Talk Technical level: Beginner

Mohit Kumar

Reviews and Ratings Spam Detection

The audience will be able to walk away with a summary view of the state-of-the-art techniques in reviews and rating spam detection and their application at Flipkart.
  • 1 comment
  • Shortlisted
  • 16 Jun 2015
Section: Crisp Talk Technical level: Intermediate

Rakesh R

When Apache ZooKeeper is a good fit

This talk focuses on how well ZooKeeper fits various use cases. Attendees will learn how to use ZooKeeper effectively in distributed clusters.
  • 0 comments
  • Confirmed & scheduled
  • 16 Jun 2015
Section: Crisp Talk Technical level: Intermediate

Jatinder Singh

Introduction to MaelStorm and Performance Engineering

Learn to build a highly performant and scalable backend using Java.
  • 0 comments
  • Rejected
  • 16 Jun 2015
Section: Workshop Technical level: Advanced

Tim Poston

Data Comes in Shapes

Data comes in shapes. The study of shape is geometry, in as many dimensions as you have variables. You can’t visualise them all, but you can see in 2D and 3D why the algebraic tools work the way they do.
  • 0 comments
  • Confirmed & scheduled
  • 16 Jun 2015
Section: Keynote Technical level: Beginner

Jatinder Singh

Real Time Bid Modification @ Million Requests per second...

Learnings from building high-performance software systems capable of handling a million requests per second while keeping response time under 10 ms.
  • 0 comments
  • Waitlisted
  • 17 Jun 2015
Section: Crisp Talk Technical level: Intermediate

Russell Nash

Deploying Batch and Streaming Architectures on AWS

Learn about the key Big Data and Analytics services on AWS and how they can be used for both batch and streaming workloads.
  • 0 comments
  • Confirmed & scheduled
  • 18 Jun 2015
Section: Sponsored Technical level: Intermediate

Yagnik

Igniting your data with Apache Spark

Introduce the audience to Spark and its API with hands-on exercises. The workshop will also deal with deploying and configuring Spark. Finally, the workshop will lead into building data applications on top of Spark and some lessons from Shopify.
  • 5 comments
  • Confirmed & scheduled
  • 02 Jul 2015
Section: Workshop Technical level: Beginner

Amod Malviya

Future patterns in data ecosystem

Understand emerging patterns of data consumption and processing to devise better data systems.
  • 0 comments
  • Confirmed & scheduled
  • 07 Jul 2015
Section: Sponsored Keynote Technical level: Intermediate

Shailesh Kumar

"Thinking Machines"

To explore the key building blocks of Artificial Intelligence: “Understanding”, “Learning”, “Thinking”, and “Creativity”.
  • 0 comments
  • Confirmed & scheduled
  • 08 Jul 2015
Section: Keynote Technical level: Advanced

Hosted by

The Fifth Elephant, known as one of the best data science and Machine Learning conferences in Asia, has transitioned into a year-round forum for conversations about data and ML engineering, data science in production, and data security and privacy practices.