Submissions

The Fifth Elephant 2013

An Event on Big Data and Cloud Computing

In 2013, commodity hardware and computing capacity for storing and processing large and small volumes of data are easily available on demand. The bigger issues pertain to questions of how to scale data processing, handle data diversity, manage infrastructure costs, decide which technologies work best for different contexts and problems, and build products from the insights and intelligence that the data is presenting to you.

The Fifth Elephant 2013 is a three-day workshop and conference on big data, storage and analytics, with product demos and hacker corners.

http://fifthelephant.in/

Event format, themes and submission guidelines

The Fifth Elephant 2013 invites proposals on use cases and real-life examples. Tell us what specific problem you faced, which technologies and tools worked for your use case and why, how you have developed business intelligence on the data you are collecting, and which analytics tools and techniques you employ. Our preference is for showcasing original work with clear take-aways for the audience. Please emphasize these in your proposal.

The conference will have two parallel tracks on 12th and 13th July:

  1. Storage: OLTP, messaging and notifications, databases and big data, NoSQL
  2. Analytics: Metrics and tools, cloud computing, mathematical modelling and statistical analysis, visualization

Workshops

This year we are adding a preliminary day of workshops, on 11th July, to give attendees more in-depth, hands-on training on open source frameworks and tools (Pig, Hadoop, Hive, etc.), commercial solutions (sponsored), programming languages such as R, and visualization techniques and tricks, among others.

Product demos and sponsored sessions

We have a demo track for startups and companies who want to showcase their product to customers at The Fifth Elephant 2013 and get feedback. Slots are also open for 4-6 sponsored sessions for companies who want to talk about their technologies and reach out to developers, CTOs, CIOs and product managers at The Fifth Elephant. For more information on demo and sponsored session proposals, write to info@hasgeek.com.

Commitment to open source

HasGeek believes in open source as the foundation of the internet. Our aim is to strengthen these foundations for future generations. If your talk describes a codebase for developers to work with, we require that it is available under a license that does not impose itself on subsequent work. This is typically a permissive open source license (almost anything that is listed at opensource.org/licenses and is not GPL or AGPL), but restrictive and commercial licenses are also considered depending on how they affect the developer’s relationship with the user.

If you’d like to showcase commercial work that makes money for you, please consider supporting the event with a sponsorship.

Proposal selection process

Voting is open to attendees who have purchased event tickets. If there is a proposal you find notable, please vote for it and leave a comment to initiate discussions. Your vote will be reflected immediately, but will be counted towards selections only if you purchase a ticket. Proposals will also be evaluated by a program committee.

Emphasis will be placed on original work and talks which present new insights to the audience.

The program committee will interview proposers who have received the most votes from attendees and the committee. Proposers must submit presentation drafts as part of the selection process to ensure the talk is in line with the original proposal and to help the program committee build a coherent line-up for the event.

There is only one speaker per session. Attendance is free for selected speakers. HasGeek will cover your travel to and accommodation in Bangalore from anywhere in the world. As our budget is limited, we will prefer speakers from locations closer to home, but will do our best to cover anyone exceptional. If you are able to raise support for your trip, we will count that towards an event sponsorship.

If your proposal is not accepted, you can buy a ticket at the same rate as was available on the day you proposed. We’ll send you a code.

Discounted tickets are available from http://fifthelephant.doattend.com/

Dates

The program committee will announce the first round of selected proposals by the end of April and a second round by the end of May, and will finalize the schedule by 20th June. The funnel will close on 5th June. The event is on 11th-13th July 2013.

Hosted by

The Fifth Elephant - known as one of the best data science and Machine Learning conferences in Asia - has transitioned into a year-round forum for conversations about data and ML engineering, data science in production, and data security and privacy practices.

Supreeth Proposing

Building a massively multiplayer online role-playing game (MMORPG) using Cloud

To introduce users to building MMORPGs using the cloud. To share experiences of doing the same.
  • 0 comments
  • Submitted
  • 19 Mar 2013
Section: Storage and Databases Technical level: Beginner

Shekhar Gulati

Building Location Aware Applications using MongoDB

Attendees will learn how they can use MongoDB’s geospatial indexing capabilities to build location-aware applications.
  • 2 comments
  • Submitted
  • 21 Mar 2013
Section: Workshops Technical level: Beginner

Arthi Venkataraman

Similar entity detection in large data

Understand similar entity recognition and its industrial applicability.
  • 2 comments
  • Confirmed & scheduled
  • 26 Mar 2013
Section: Analytics and Visualization Technical level: Intermediate

Deepak Shenoy

Money Talks: Analyzing Financial Market Data

Financial markets produce a ton of data, but how can we look at it in useful ways, as opposed to “looks-great-what-do-I-do-now”? By useful I mean useful to traders, to fraud-detectors, to investors and even to company management. Learn about the techniques of market data analysis from someone who’s done all the wrong things, sometimes in spectacular fashion.
  • 2 comments
  • Submitted
  • 26 Mar 2013
Section: Analytics and Visualization Technical level: Intermediate

Mahesh Rangarajan

Big Data Analytics for improving Patient Care systems at hospitals

Share real-world experience and learning from implementing large-scale patient care systems at hospitals, leveraging a Big Data platform.
  • 1 comment
  • Submitted
  • 26 Mar 2013
Section: Analytics and Visualization Technical level: Advanced

prakash babu

Implementing a Large Scale Surveillance System Using Big Data

To share the key learnings, architecture patterns and best practices in implementing a large-scale real-time analytics system leveraging Big Data technologies.
  • 2 comments
  • Submitted
  • 26 Mar 2013
Section: Analytics and Visualization Technical level: Advanced

Kashyap Kompella

A 360 degree view of 3-D printing

(1) Provide a brief tour of the exciting world of 3D printing & (2) Discuss opportunities and applications in the Indian context
  • 1 comment
  • Submitted
  • 26 Mar 2013
Section: Analytics and Visualization Technical level: Beginner

Mayank Sharma

Transferring Gigabytes of Data to cloud at 10mbps on your 10mbps link

TCP/IP is known to be a pretty robust, reliable and fair mode of data transport. But what about the actual throughput when you are transferring gigabytes of data on a 10mbps link to a cloud which is maybe 15 hops away?
  • 0 comments
  • Submitted
  • 26 Mar 2013
Section: Storage and Databases Technical level: Intermediate
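
As a rough illustration of why a long path can limit throughput: a single TCP connection can keep at most one window of unacknowledged data in flight per round trip, so its throughput is bounded by window size divided by round-trip time. The sketch below assumes a 64 KiB window (no window scaling) and hypothetical round-trip times; none of these figures come from the proposal.

```python
def max_tcp_throughput_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-connection TCP throughput (window / RTT), in Mbit/s."""
    return window_bytes * 8 / rtt_seconds / 1_000_000


WINDOW = 64 * 1024  # assumed 64 KiB window, i.e. no window scaling

# Hypothetical round-trip times, chosen only for illustration
for rtt_ms in (10, 50, 200):
    bound = max_tcp_throughput_mbps(WINDOW, rtt_ms / 1000)
    print(f"RTT {rtt_ms} ms: at most {bound:.2f} Mbit/s")
```

Under these assumptions, a 200 ms round trip caps one connection well below the capacity of a 10mbps link, whatever the link speed.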

Pankaj Risbood

Extracting consumer trends in real time using 100 billion tweets.

Connected consumers express everything they feel about products, services and brands in social media. How consumers feel about and engage with products is very important for retailers to make better merchandizing decisions.
  • 2 comments
  • Confirmed & scheduled
  • 27 Mar 2013
Section: Analytics and Visualization Technical level: Intermediate

Karthik Kastury

Unlocking the Potential of Data for Everyday Developers and Product Managers

Developers love writing code, and shipping new features. Product Managers love observing customer behaviours and taking product decisions!
  • 2 comments
  • Confirmed & scheduled
  • 28 Mar 2013
Section: Analytics and Visualization Technical level: Intermediate

Prashanth Babu

Big Data, Real-time Processing and Storm

Participants will learn and understand the concepts and salient features of Storm.
  • 2 comments
  • Confirmed & scheduled
  • 29 Mar 2013
Section: Workshops Technical level: Beginner

Y

Uncovering the truth in sales through Visualization

How can visualization help a sales manager review his team’s performance?
  • 1 comment
  • Submitted
  • 29 Mar 2013
Section: Analytics and Visualization Technical level: Beginner

Shanker Balan

Build A Cloud With Apache CloudStack For Big Data

To build a Proof-Of-Concept IaaS cloud that can power big data workloads using the Apache CloudStack Cloud Management Platform and Apache Hadoop.
  • 2 comments
  • Submitted
  • 29 Mar 2013
Section: Workshops Technical level: Intermediate

Viral B. Shah

Julia: A fresh approach to technical computing and data science

Julia is a new high-performance, open source, dynamic language for technical computing and is especially relevant for the upcoming field of data science. I will describe the rationale and the vision behind Julia, key language features, and show some demos so that attendees can get a feel for the language. I will also discuss Julia’s open source development process and the community that keeps adding…
  • 2 comments
  • Confirmed & scheduled
  • 01 Apr 2013
Section: Analytics and Visualization Technical level: Beginner

Sameer Segal

Big Data at the Base of the Pyramid

I was actually tempted to give this talk a tongue-in-cheek title like “Big Data with No Data” because “data” at the Base of the Pyramid (BoP) is an ephemeral and elusive thing.
  • 4 comments
  • Submitted
  • 03 Apr 2013
Section: Analytics and Visualization Technical level: Intermediate

Sirish M Simha

Big Data Product Ideas - Building Interactive BI Analytics

As more and more companies embrace Big Data (read as Hadoop for now) technologies, there are opportunities for companies and geeks to address the ‘GAPS’ by designing and implementing products on the Hadoop platform.
  • 1 comment
  • Submitted
  • 03 Apr 2013
Section: Analytics and Visualization Technical level: Intermediate

Vishwanath Belur

Big Data Predictive Analysis in SAP HANA with SAP Predictive Analysis

Traditionally, analyzing big volumes of data using statistical techniques was either not possible or took a lot of time, and hence businesses were losing out on the advantage of storing huge amounts of historical data. With SAP HANA, it is possible to perform big data predictive analytics in real time so that businesses can make informed real-time decisions. SAP Predictive Analysis simplifi…
  • 5 comments
  • Submitted
  • 03 Apr 2013
Section: Analytics and Visualization Technical level: Intermediate

Regunath Balasubramanian

Latency and Fault tolerance in OLTP @ 1.5 billion/day service calls

User-perceived availability and experience are important for any eCommerce site. Achieving them is not easy for distributed systems that run multiple platforms and access multiple resources and data sources. The data sources span MySQL, key-value stores and columnar databases storing OLTP data to the order of tens of millions. This talk describes how Flipkart built its website to manage Latency and F…
  • 6 comments
  • Confirmed & scheduled
  • 05 Apr 2013
Section: Storage and Databases Technical level: Intermediate

Raghu Kashyap

Big Data is it a fad or future?

See how Hadoop has progressed over the last 4 years from an early adopter organization’s standpoint. Learn about the use cases that are operational and not just a POC.
  • 4 comments
  • Submitted
  • 08 Apr 2013
Section: Storage and Databases Technical level: Intermediate

Ritesh Nayak

Deciphering the organizational DNA - mining internal data

If you attend this talk, you will see brilliant visualizations of data inside an enterprise and learn some social network analysis and graph-theoretic techniques.
  • 1 comment
  • Submitted
  • 09 Apr 2013
Section: Analytics and Visualization Technical level: Beginner

Shailesh Kumar

Co-occurrence Analytics: A versatile framework for finding interesting needles in crazy haystacks!

In this session we will learn about a new way of thinking about data mining and big data analytics, “Co-occurrence Analytics” - a unified framework for mining latent insights in a wide variety of data of the form: “relationships between entities”. We will show how the framework can be used to discover...
  • 2 comments
  • Confirmed & scheduled
  • 10 Apr 2013
Section: Analytics and Visualization Technical level: Advanced

Vishnu H Rao

MySQL Robbins - Various Flavors of Files & Buffers it Uses

Learn about the various Files & In-Memory Buffers MySQL creates and how it uses them.
  • 0 comments
  • Submitted
  • 12 Apr 2013
Section: Storage and Databases Technical level: Beginner

Vishnu H Rao

Reporting Using MySQL Multi-Source Replication

How can we do real time reporting using MySQL when data is spread across different MySQL instances?
  • 0 comments
  • Submitted
  • 12 Apr 2013
Section: Storage and Databases Technical level: Beginner

Srihari Srinivasan

It takes two to tango! - Is SQL-on-Hadoop the next big step?

To explore the trend of SQL-on-Hadoop. This talk will focus on some of the recent attempts (OSS and Commercial) to get SQL running on Hadoop.
  • 4 comments
  • Confirmed
  • 12 Apr 2013
Section: Storage and Databases Technical level: Intermediate

Piyush Verma

Build a Queue Based Concurrent Task Processor (using Python)

Learn how to develop a Persistent Queue based Task Processor using simple tools like MongoDB and Python.
  • 4 comments
  • Submitted
  • 13 Apr 2013
Section: Workshops Technical level: Advanced

Chandramouli Mahadevan

Analyzing Terabytes of Data with Google BigQuery

An attendee would understand how they can use Google BigQuery to analyze very large data sets in a very simple fashion using interactive SQL analysis. We will also offer insight into the implementation of the Dremel engine that powers Google BigQuery.
  • 9 comments
  • Confirmed & scheduled
  • 13 Apr 2013
Section: Analytics and Visualization Technical level: Beginner

Sandeep Ravichandran

Building a high performance distributed crawler

This talk describes how we use NoSQL databases like MongoDB and Redis to store a huge amount of data and analyze it using tools like elasticsearch. It also aims to provide insight into leveraging different cloud services to build a high performance cluster for web crawling.
  • 2 comments
  • Submitted
  • 15 Apr 2013
Section: Storage and Databases Technical level: Intermediate

Koushik

Breaking Barriers - Showing the funny

To have people troll at me and with me when I talk technology and a lot of other things. If you’ve seen Comic Relief, The Tonight Show, or Comedy Central, well, I have also seen these shows. Hence I assume you know what is going to come at you. Amen!
  • 2 comments
  • Submitted
  • 17 Apr 2013
Section: Analytics and Visualization Technical level: Intermediate

Aaron Morton

Apache Cassandra for Fun and Profit

Attendees will learn about data modelling and performance in Cassandra, which patterns to follow and which to avoid. They will also learn how to evaluate the performance of various models and plan for data growth.
  • 1 comment
  • Submitted
  • 21 Apr 2013
Section: Storage and Databases Technical level: Intermediate

Shailesh Kumar

MapReduce and the "Art of Thinking Parallel"

The goal of the session is to take the audience from the “MECHANICS of using MapReduce” (to do simple slicing and dicing of BigData) to the “ART of using MapReduce” to solve more complex problems that at first glance look “unnatural” for MapReduce!
  • 3 comments
  • Confirmed & scheduled
  • 23 Apr 2013
Section: Analytics and Visualization Technical level: Advanced

Neeta Pande

Big Data Analytics with R

An attendee would understand the High Performance and Parallel Computing landscape in R. This area in R is undergoing rapid change, and the objective of this session is to provide insight into various active contributions in this area. In the session, we would also delve deeper into analyzing moderately large data sets, which presents a huge opportunity today as a solution to the “everything in memory” challenge…
  • 1 comment
  • Submitted
  • 23 Apr 2013
Section: Analytics and Visualization Technical level: Intermediate

Karthik Shashidhar

7 Ways to call elections using data

Psephology has been turning more into a science than an art. This session explains how you can call elections using only data, even if you don’t have much domain knowledge.
  • 1 comment
  • Submitted
  • 24 Apr 2013
Section: Analytics and Visualization Technical level: Beginner

Ramana Reddy

Open Source Business Intelligence - Pentaho BI Suite

Participants will be taken through the basics of Business Intelligence tools and the various tools available in the market, with a focus on open source tools. Participants will be shown how to use Pentaho for data integration and reporting purposes.
  • 0 comments
  • Submitted
  • 25 Apr 2013
Section: Product Demos Technical level: Intermediate

Rajat Venkatesh

Workflow Schedulers: The Heart Beat of a Big Data Stack

With use cases of how Qubole customers use the Scheduler product, I’ll talk about:
  • 0 comments
  • Confirmed & scheduled
  • 26 Apr 2013
Section: Storage and Databases Technical level: Intermediate

t3rmin4t0r

HOWTO run a hadoop cluster on a laptop

Most tutorials involve being consumers of Hadoop instead of being developers of the core technology. And even then, doing it without the backing of a company or someone else to foot the bill for your cluster hardware is a problem that’s missing from most FAQs.
  • 1 comment
  • Confirmed & scheduled
  • 28 Apr 2013
Section: Storage and Databases Technical level: Beginner

Pradeep Kumar G.S.

Low Latency Access of Bigdata using Spark and Shark.

This session aims to introduce the latest big data technology for low-latency access and in-memory data storage using the Spark framework.
  • 2 comments
  • Submitted
  • 30 Apr 2013
Section: Storage and Databases Technical level: Beginner

Apoorva Gaurav

Cloud based low cost, low maintenance, scalable data platform

The session is aimed at companies and individuals who are contemplating plugging into the big data world but are avoiding it due to upfront technical and monetary investments. Some of the questions it tries to answer are:
  • 0 comments
  • Confirmed & scheduled
  • 30 Apr 2013
Section: Storage and Databases Technical level: Beginner

Bharath Mohan

What is Multi-Stream Retrieval?

Multi-stream retrieval is about humans querying, exploring and discovering from streams of information.
  • 0 comments
  • Submitted
  • 30 Apr 2013
Section: Analytics and Visualization Technical level: Intermediate

Andreas Kollegger

Neo4j Graphs: What, When, How

You will leave with an understanding of what a graph database is, what advantages it can offer, and when to use one. We’ll focus on Neo4j, quickly covering its capabilities then looking at some real world use cases from Fortune 500 companies.
  • 8 comments
  • Confirmed & scheduled
  • 30 Apr 2013
Section: Storage and Databases Technical level: Beginner

Pranav Modi

Uncovering patterns and forecasting with time series data

Understand time series analysis and its applications in industry and science. Uncover patterns in data - trends, seasonality, cyclical behavior.
  • 0 comments
  • Confirmed
  • 30 Apr 2013
Section: Analytics and Visualization Technical level: Intermediate

Abhishek Kona

The database cannot be better than the underlying datastructure

Understand the common underlying data structures in current storage engines, the trade-offs, and why this should drive your decision about which database to use for your next app.
  • 0 comments
  • Submitted
  • 01 May 2013
Section: Storage and Databases Technical level: Intermediate

Lakshman Prasad

Interactive analysis of data live, using Pandas, Matplotlib and IPython

The session is a live coding session to analyse various datasets using Pandas and plotting them live, in an IPython notebook.
  • 3 comments
  • Confirmed & scheduled
  • 01 May 2013
Section: Analytics and Visualization Technical level: Beginner

Anand S

Visualising networks

This talk shows ways of visualising network data at a large and a small scale.
  • 7 comments
  • Confirmed & scheduled
  • 02 May 2013
Section: Analytics and Visualization Technical level: Intermediate

Swaroop Krothapalli

Can twitter kill Boeing 787 ?

Of late, leading brands have realized the potential opportunities Twitter could provide, much beyond the realm of advertising or making product announcements. Customers are increasingly turning towards this micro-blogging service to talk about various products, lodge complaints and discuss everyday events. As a result, the “buzz” that a brand has on Twitter is not just a function of its recent/…
  • 0 comments
  • Submitted
  • 03 May 2013
Section: Analytics and Visualization Technical level: Beginner

Rajan Chandi

Why we went 100% NoSQL with Mongodb?

Highlighting key differences between application design using relational and NoSQL databases. In what ways can NoSQL save development time and get the product out faster, while maintaining high performance and future scalability at launch?
  • 0 comments
  • Submitted
  • 08 May 2013
Section: Storage and Databases Technical level: Intermediate

Andreas Kollegger

Neo4j Graph Workshop

Establish the technical foundation for working with the Neo4j graph database. From install, to fundamental data modeling to querying.
  • 2 comments
  • Confirmed & scheduled
  • 10 May 2013
Section: Workshops Technical level: Beginner

Russell Sullivan

Customizing One Database for Your Multiple Data Structures

Attendees will gain hands-on experience on how to address the challenge of managing multiple types of structured and unstructured data by customizing data-structures to accurately represent their data as it exists and is queried in its natural form, to attain an impedance match between data in the wild and its model. The data structures will then go through a series of customizations to optimize …
  • 0 comments
  • Submitted
  • 10 May 2013
Section: Workshops Technical level: Advanced

brian bulkowski

Evaluating SSD Performance for Databases Handling Real-Time Big Data

Attendees will learn how to evaluate the performance of flash-based SSDs for managing high-velocity big data using the open source ACT benchmark tool.
  • 0 comments
  • Confirmed & scheduled
  • 10 May 2013
Section: Storage and Databases Technical level: Intermediate

Harpreet Singh

Analysis of genomics data and linking to phenotype of country population to identify health markers

A key factor in determining an individual’s susceptibility to disease as well as response to treatment should include the recognition of both the extrinsic (environmental) and the intrinsic (physiological and genomic) factors. There is a definite need and scope for evolving novel ways to stratify healthy individuals and develop a better understanding of normal phenotypic variation. Human physiolo…
  • 0 comments
  • Submitted
  • 10 May 2013
Section: Storage and Databases Technical level: Advanced

Mahesh Kumar

Predictive Analytics in Social Media and Online Display Advertising

The last decade has seen unprecedented growth in the space of online advertising and digital media marketing. The new wave of social media (Facebook, Twitter, etc.) is making it easier than ever for marketers to reach the right customers at the right time with the right products and offers. However, the marketers, online advertising platforms, and other stakeholders need to be equipped with suita…
  • 0 comments
  • Submitted
  • 10 May 2013
Section: Analytics and Visualization Technical level: Intermediate

Satnam Singh

Smart Analytics in Smartphones

The objective of this talk is to present the pros and cons of performing data mining on smartphones vs. servers. I intend to discuss state-of-the-art use cases that can employ data mining and machine learning techniques on smartphones. I will present a case study to demonstrate my thoughts.
  • 0 comments
  • Confirmed & scheduled
  • 12 May 2013
Section: Analytics and Visualization Technical level: Intermediate

Peter Milne

Big Data Enlightenment

To de-mystify the nomenclature of Big Data and NoSQL
  • 0 comments
  • Submitted
  • 14 May 2013
Section: Storage and Databases Technical level: Beginner

Ramesh Kumar M

Demystifying Big Data from Domain Name industry

This session will provide a comprehensive overview of worldwide domain name registry and domain name resolution data, sources and tools. By attending this session, data enthusiasts will learn to look at domain name industry data from multiple perspectives, interpret key parameters, visualize and correlate them, compare them with their own datasets, and gain valuable insights and trends for their respective fields.
  • 0 comments
  • Submitted
  • 15 May 2013
Section: Analytics and Visualization Technical level: Intermediate

Enrico Berti

An introduction to Hue, the open source Hadoop UI

You will learn about Hue and its open source tools that allow real time exploration and analytics on Hadoop.
  • 0 comments
  • Submitted
  • 15 May 2013
Section: Analytics and Visualization Technical level: Beginner

Prabhu Prakash Ganesh

Building large scale Analytics Platform

As companies try to test the waters of big data, they are bombarded by a lot of hype and diverse opinions, so it is easy to be overwhelmed. In this session, I plan to share our experience in building a large scale analytics platform, the choices we made and why. The intention is to help people make decisions for themselves or their organizations.
  • 0 comments
  • Confirmed & scheduled
  • 15 May 2013
Section: Analytics and Visualization Technical level: Intermediate

Adethya Sudarsanan

Open Data Aero - An opportunity for the Airline Industry

Participants will get to know data’s perspective on the airline industry.
  • 0 comments
  • Submitted
  • 17 May 2013
Section: Storage and Databases Technical level: Intermediate

Erik Rose

What Happens When Firefox Crashes?

Mozilla tames a rabble of HBase, PostgreSQL, RabbitMQ, elasticsearch, and Python to chew through 50 Firefox crash reports each second. Explore the strengths and weaknesses of these tools and their consequent niches in the greater crash-catching system. Greet the complexities that emerge from the combination, and see how we engineer around them to keep our never-lost-a-crash record pristine.
  • 1 comment
  • Confirmed & scheduled
  • 18 May 2013
Section: Storage and Databases Technical level: Intermediate

Varsha Joshi

A Billion Snapshots- Principles and Processes in the Census of India

The session will explain how the Census of India 2011 was designed, canvassed, processed, and analysed, to obtain a detailed picture of a huge population of great diversity and complexity.
  • 1 comment
  • Confirmed & scheduled
  • 18 May 2013
Section: Analytics and Visualization Technical level: Beginner

Mahesh Tiyyagura

Find Near Duplicate records in your Data

Find customers with multiple mobile numbers in a dataset of 2 million customer records.
  • 0 comments
  • Submitted
  • 19 May 2013
Section: Workshops Technical level: Intermediate

Rohit Chatter

Analytics: Make non-additive metrics additive using HBase & Bitmaps

Demonstrate the use of bitmaps on HBase and how they can make non-additive metrics additive very efficiently.
  • 0 comments
  • Submitted
  • 19 May 2013
Section: Analytics and Visualization Technical level: Advanced
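
To illustrate the general idea of making a non-additive metric additive with bitmaps: a distinct-user count cannot be summed across days, because returning users get counted twice, but per-day bitmaps of user ids can be combined with a bitwise OR and counted afterwards. The sketch below is not the proposer's implementation and does not use HBase; it uses plain Python integers as bitmaps, and the user ids and day names are made up for illustration.

```python
def to_bitmap(user_ids):
    """Build a bitmap (a Python int) with bit i set for each integer user id i."""
    bitmap = 0
    for uid in user_ids:
        bitmap |= 1 << uid
    return bitmap


def count(bitmap):
    """Number of set bits, i.e. number of distinct users in the bitmap."""
    return bin(bitmap).count("1")


# Hypothetical daily visitors
monday = to_bitmap([1, 2, 3])
tuesday = to_bitmap([2, 3, 4])

naive_sum = count(monday) + count(tuesday)  # double-counts users 2 and 3
combined = count(monday | tuesday)          # OR the bitmaps, then count

print(naive_sum, combined)
```

Because the OR of two bitmaps is itself a bitmap, per-day results can be rolled up to weeks or months without going back to the raw events.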

Rohit Chatter

Evaluate audience live use cases and Big Data Technology solutions

Have the audience describe their problems, and we propose a high-level solution using Big Data technologies. This could be very useful, as lots of people are looking for solutions to their problems and don’t know how to use these technologies.
  • 0 comments
  • Submitted
  • 19 May 2013
Section: Workshops Technical level: Intermediate

Rohit Chatter

Product Demo: Analyze & Visualize Big Data right off the grid

Demonstrate a home-grown BI tool that enables reporting and analytics directly off the grid.
  • 0 comments
  • Submitted
  • 19 May 2013
Section: Product Demos Technical level: Advanced

Mahesh Tiyyagura

Streaming live-data to LCD screens in office (using opensource tools and Rs. 4300)

We’ll cover ideas, software and hardware that can be used to create and display beautiful real-time visualizations of different metrics/events on LCD TVs in your office.
  • 0 comments
  • Submitted
  • 20 May 2013
Section: Analytics and Visualization Technical level: Beginner

Rajat Venkatesh

Analytics using Hadoop ecosystem on AWS

The workshop will go through the steps required to use the AWS ecosystem as an analytics backend. While we will discuss general design patterns - in many cases we will show examples using the Qubole platform.
  • 0 comments
  • Confirmed & scheduled
  • 20 May 2013
Section: Workshops Technical level: Intermediate

Dr.S.Jayaprakash,Ph.D

Insurance Fraud Modeling & Business Intelligence Framework

We have successfully created a product based on an Insurance Fraud Modeling Framework backed by robust Business Intelligence Analytics. Some of the USPs of this model are (a) quick to deploy, within a few weeks, (b) proven statistical models, (c) a minimum of 10X ROI for insurance companies, and (d) deployment made possible at a fraction of the budgets of IT departments. The principles behind the framework are scala…
  • 2 comments
  • Submitted
  • 21 May 2013
Section: Product Demos Technical level: Advanced

Dr.S.Jayaprakash,Ph.D

Grooming Geeks - Analytics & Application in Education

The role of education in the contemporary world is not mere knowledge but creating a ‘bent of mind’ for students to adapt themselves to the competitive environment and grow. Not all science students end up as scientists, nor all engineering students as engineers. A chemical engineer ends up as a Java programmer and a science graduate works for a courier firm, yet they still work exception…
  • 1 comment
  • Submitted
  • 22 May 2013
Section: Analytics and Visualization Technical level: Beginner

Kaushik Paranjape

Real time analytics on data that spans 100s of GBs

Analytics of stats data is a problem frequently faced by organizations that have online businesses. Doing real-time analytics on this data is very important for making the right business decisions. Sokrati’s solution for this problem uses sharded columnar databases as the data warehouse. Using this technology, query results are returned in milliseconds on data sets that span 100s of GBs.
  • 0 comments
  • Submitted
  • 23 May 2013
Section: Storage and Databases Technical level: Intermediate

Anshum Gupta

SolrCloud and NoSQL

If Search in NoSQL, NoSQL in Search, or either of them separately interests you, this session could be for you. I’ll talk about how NoSQL datastores have evolved and are trying to provide search, while Solr is moving towards them as a potentially great ‘big-data’ store.
  • 2 comments
  • Confirmed & scheduled
  • 23 May 2013
Section: Storage and Databases Technical level: Intermediate

Ashwin Raghav Mohan Ganesh

Rnotify - A Scalable Application Level Distributed Filesystem Notifications Solution

Rnotify is a Distributed Filesystem Notification Solution. It helps application programmers use inotify-like abstractions to receive notifications from Distributed File Systems. This session will help programmers understand how one can build applications that require high throughput of notifications from File Systems. more
  • 0 comments
  • Submitted
  • 24 May 2013
Section: Storage and Databases Technical level: Advanced

Tim Davies

Linked Data - visions & implementations

This talk will provide an introduction to the history, principles and practice of Linked Open Data. more
  • 0 comments
  • Submitted
  • 24 May 2013
Section: Storage and Databases Technical level: Beginner

Tim Davies

Infrastructures and eco-systems for open data

This talk / workshop will provide an overview of developments in open data across the globe, and will outline some of the technical and organisational challenges to effectively publishing or making use of open data. more
  • 0 comments
  • Submitted
  • 24 May 2013
Section: Analytics and Visualization Technical level: Beginner

rajdeep dua

Introduction to Pivotal HD - Hadoop distribution with a SQL compliant query engine

This session will provide an overview of Pivotal HD, a next-generation Hadoop distribution which integrates an MPP database running on top of HDFS. This session will also provide an introduction to the next release of Spring Data Hadoop, which integrates with Hadoop 2.x. more
  • 2 comments
  • Submitted
  • 27 May 2013
Section: Storage and Databases Technical level: Intermediate

Sushrut Bidwai

Data and Sales

Petabytes of data are available in the public domain and more is created every day. more
  • 0 comments
  • Submitted
  • 27 May 2013
Section: Analytics and Visualization Technical level: Beginner

Pramod N Haritsa

Making Sense of content in domain intense QA/Discussion Forums- A Text Mining Problem

Stack Overflow: Imagine a world without such user collaboration and moderation in maintaining and getting information from discussion forums. In this talk we’ll see how one can make sense of content in QA/Discussion Forums using a palette of text processing techniques. more
  • 2 comments
  • Submitted
  • 29 May 2013
Section: Analytics and Visualization Technical level: Beginner

Piyush

Need for “Lmetric” : the service for near real-time clickstream events and User behavior analysis

The session will cover the following: • Different types of data sources: collection, trending and analysis. • Capturing events: why structured data emitted from apps for machines is a better approach. • Centralized collection of events in distributed environments. • Easy access to collected data in consumable form for reporting and analysis. more
  • 0 comments
  • Submitted
  • 30 May 2013
Section: Storage and Databases Technical level: Beginner

Anand

RHadoop: Marrying analytics & large scale data processing

Why Hadoop is not analytics. Why “Big Data” is not analytics. more
  • 0 comments
  • Submitted
  • 02 Jun 2013
Section: Analytics and Visualization Technical level: Beginner

Shashi Shekhar Singh

Using ElasticSearch to build your Startup's Dashboard - Pros and Cons

At the end of the session, attendees should be able to understand the factors that they need to look at when evaluating whether to use ElasticSearch to build a dashboard for their organization. more
  • 0 comments
  • Submitted
  • 03 Jun 2013
Section: Storage and Databases Technical level: Beginner

Rahul Kulkarni

Build Products, Not Just Algorithms: 10 Examples from the Real World

Why are we able to build databases that can store petabytes, algorithms that seem to have an answer for everything and yet are not able to translate these into scalable, mass adopted big data products? more
  • 1 comment
  • Submitted
  • 03 Jun 2013
Section: Analytics and Visualization Technical level: Intermediate

Tapomay Dey

What did we gain out of using MongoDB, Redis and MySQL in a single system

Let’s talk about a multi-faceted system with unique data storage and processing challenges. One of the core challenges in SEM is to set up and maintain a marketing campaign. A good way to do so is to know the inventory inside out and be smart about it. With this system we aim to address such challenges in a scalable manner. more
  • 0 comments
  • Submitted
  • 03 Jun 2013
Section: Storage and Databases Technical level: Intermediate

Anand S

Advanced data analysis with Excel

Learn what you can do with Excel, really pushing it to the edge. more
  • 0 comments
  • Submitted
  • 03 Jun 2013
Section: Workshops Technical level: Advanced

Harshad Saykhedkar

Finding order in the chaos : machine learning for web text analytics using R

Participants will gain an understanding of the following (through R), more
  • 1 comment
  • Confirmed & scheduled
  • 03 Jun 2013
Section: Workshops Technical level: Beginner

Rahul Kulkarni

Telling Twins Apart: A Cookie's Life And Other Stories From The Ad World

Ad targeting has evolved by leaps and bounds over the past 5 years - right from Google Analytics based remarketing on Google to custom audiences, lookalikes and partner categories on Facebook to audience buys across display networks. Targeting your ads precisely to the right audience personas is nothing but a big data problem with hundreds of thousands of targeting variable combinations possible.… more
  • 0 comments
  • Submitted
  • 03 Jun 2013
Section: Analytics and Visualization Technical level: Intermediate

Abinasha Karana

15 Billion value at risk computations in 187 milliseconds

The session will touch upon real-time analytic and search applications on the Hadoop platform. It covers areas such as HBase schema design, Hadoop configuration, cluster design and the real-time map-reduce pattern. more
  • 1 comment
  • Confirmed & scheduled
  • 04 Jun 2013
Section: Storage and Databases Technical level: Intermediate

Abhishek Vaid

Implementing Named-Entity-Recognizer on Twitter Data and Using it to Cluster Similar Tweets.

To understand the scope of extracting Real-Entities from tweets using freely available NER-Engines and POS-Tagging. more
  • 0 comments
  • Submitted
  • 04 Jun 2013
Section: Analytics and Visualization Technical level: Intermediate

Nikil Doshi

Tracking 2B parameters/month in real time - with just MySQL!

Whether a startup or a big organization, everyone faces the challenge of keeping track of data in real time. Data availability in real time has significant advantages, particularly in the search engine marketing and analytics space, where most decisions are made on the fly. However, real-time systems are often assumed to need the latest and greatest in technology. more
  • 0 comments
  • Submitted
  • 04 Jun 2013
Section: Storage and Databases Technical level: Intermediate

michael gurstein

Riding a kneeling elephant: Community Informatics Bridging Data Into Communities

To explore how to link the technical work of software, open data, Big Data and others to the needs of local communities/villages. more
  • 0 comments
  • Submitted
  • 04 Jun 2013
Section: Workshops Technical level: Beginner

Ankur Nagar

Can big data fight poverty and corruption?

Big data is proving to be transformative in the private sector, but can it also help solve international development problems? more
  • 0 comments
  • Submitted
  • 05 Jun 2013
Section: Analytics and Visualization Technical level: Beginner

Srinivasan H Sengamedu

The art and science of exploiting near-similar text and images

Big Data, by its inherent nature, will have near-similar items. Identifying the repetitions and, even better, leveraging them to get your job done is both an art and science. The goal of this talk is to share some experiences with this and to get you excited about this. more
  • 0 comments
  • Submitted
  • 05 Jun 2013
Section: Analytics and Visualization Technical level: Intermediate

Anurag

Workshop: Learning ElasticSearch and using it to analyze Aadhaar's Public Datasets

You have a large data set, commercial off-the-shelf hardware, and a project deadline that is looming. How do you manipulate the data and extract useful information? And how steep is the learning curve? more
  • 0 comments
  • Confirmed & scheduled
  • 05 Jun 2013
Section: Workshops Technical level: Beginner

Vinayak Hegde

Data Analysis and Visualization using R

To provide intermediate-to-advanced usage of R. To do a deep dive into different R modules using public data sets. more
  • 0 comments
  • Confirmed & scheduled
  • 05 Jun 2013
Section: Workshops Technical level: Intermediate

prabhakar srinivasan

Audience Segmentation: Data-Science, Big-Data Architecture & Solution

The objective of this talk is to show how analytics techniques can be used to answer fundamental questions in the area of Audience Segmentation for the Broadcast domain. Various stakeholders like planners, channels, operators, media agencies, advertisers are very keen to know ‘who is watching TV’ so that the products, services, content and advertisements can be better customized and tailored to t… more
  • 0 comments
  • Submitted
  • 05 Jun 2013
Section: Analytics and Visualization Technical level: Intermediate

Mrinal Wadhwa

A hands-on introduction to Apache Hadoop

To help you get a fundamental understanding of: What is Hadoop? more
  • 0 comments
  • Submitted
  • 05 Jun 2013
Section: Workshops Technical level: Beginner

Viraj Paripatyadar

How to build a Recommender using Apache Mahout

This session covers the creation of a recommender using Apache Mahout for a consumer Web application. After attending this session, application developers will be able to recognize the need for recommenders in their applications and will be able to start planning and implementing them for their specific use cases. more
  • 3 comments
  • Submitted
  • 05 Jun 2013
Section: Analytics and Visualization Technical level: Beginner

Amit Kapoor

Telling visual stories with data

Data visualisation has enabled us to compress data and express it visually in many interesting new ways. It is often said that we are trying to tell stories through visualisations. Is that really the case? How can we ensure that the audience / consumer is able to RETAIN - RECALL - RETELL our data-driven stories? more
  • 1 comment
  • Submitted
  • 05 Jun 2013
Section: Analytics and Visualization Technical level: Beginner

Edouard Servan-Schreiber

MongoDB: An Overview

Participants of this workshop will have a sense of how easy it is to get started with MongoDB and why it makes their development faster and easier. They will also understand how MongoDB allows them to scale. more
  • 7 comments
  • Confirmed & scheduled
  • 20 Jun 2013
Section: Workshops Technical level: Beginner

Edouard Servan-Schreiber

Agility and Innovation vs IT: how new data platforms can overcome this neverending struggle

To explain to participants how NoSQL technology and private cloud practices can help in achieving scalable growth and fast time to market. more
  • 0 comments
  • Confirmed & scheduled
  • 20 Jun 2013
Section: Storage and Databases Technical level: Intermediate

Edouard Servan-Schreiber

Strategic advantages of MongoDB

This session will present the strategic advantages of MongoDB as an operational data store. more
  • 2 comments
  • Confirmed & scheduled
  • 20 Jun 2013
Section: Storage and Databases Technical level: Intermediate

Ajay Kelkar

Telling stories with data

Most people see Analytics as a technical & specialist role. I see Analytics as an intersection between technology & business. I will tell stories from data that I have seen over the last decade. Stories about how data has made a difference to a business & how you can go about doing the same. more
  • 0 comments
  • Confirmed
  • 22 Jun 2012
Technical level: Intermediate Session type: Discussion

Hosted by

The Fifth Elephant - known as one of the best data science and Machine Learning conferences in Asia - has transitioned into a year-round forum for conversations about data and ML engineering; data science in production; data security and privacy practices. more