The Fifth Elephant 2012

Finding the elephant in the data.

What are your users doing on your website or in your store? How do you turn the piles of data your organization generates into actionable information? Where do you get complementary data to make yours more comprehensive? What tech, and what techniques?

The Fifth Elephant is a two-day conference on big data.

Early Geek tickets are available from

The proposal funnel below lets you submit a session and vote on proposed sessions. It is good practice to introduce yourself and share details about your work, as well as the subject of your talk, when proposing a session.

Each community member can vote for or against a talk. A vote from each member of the Editorial Panel is equivalent to two community votes. Both types of votes will be considered for final speaker selection.

It’s useful to keep a few guidelines in mind while submitting proposals:

  1. Describe how to use something that is available under a liberal open source license. Participants can use this without having to pay you anything.

  2. Tell a story of how you did something. If it involves commercial tools, please explain why they made sense.

  3. Buy a slot to pitch whatever commercial tool you are backing.

Speakers will get a free ticket to both days of the event. Proposers whose talks are not on the final schedule will be able to purchase tickets at the Early Geek price of Rs. 1800.

Hosted by

All about data science and machine learning



Big Data in Real Time: Processing the Social Web within a tolerable elapsed time

Submitted Jun 14, 2012

This talk examines the challenges of dealing with ‘Big Data’ in terms of time, processing and storage, with special emphasis on social media intelligence systems, which require crawling and processing terabytes of data within a ‘reasonable’ time frame.


The extreme virality of social data makes near real-time alerting and response systems essential to the success of social media monitoring and engagement platforms. Covering live storage systems, modified caching mechanisms and combinations of SQL and NoSQL systems, we discuss the new requirements, challenges and limitations that have forced us to combine, modify and at times rebuild several of these systems.



Speaker bio

Harish is the CTO at Webfluenz Pte Ltd and Director at 4am Design and Technology Labs.

Harish drives the creation of proprietary Webfluenz Intelligence and Data Gathering systems by combining established computer science concepts with cutting-edge, continuously evolving technologies.

In the past Harish has worked on various intelligent search systems in different environments.


