The Fifth Elephant 2012

Finding the elephant in the data.

What are your users doing on your website or in your store? How do you turn the piles of data your organization generates into actionable information? Where do you get complementary data to make yours more comprehensive? What tech, and what techniques?

The Fifth Elephant is a two-day conference on big data.

Early Geek tickets are available from fifthelephant.doattend.com.

The proposal funnel below will enable you to submit a session and vote on proposed sessions. It is good practice to introduce yourself and share details about your work, as well as the subject of your talk, while proposing a session.

Each community member can vote for or against a talk. A vote from each member of the Editorial Panel is equivalent to two community votes. Both types of votes will be considered for final speaker selection.

It’s useful to keep a few guidelines in mind while submitting proposals:

  1. Describe how to use something that is available under a liberal open source license. Participants can use this without having to pay you anything.

  2. Tell a story of how you did something. If it involves commercial tools, please explain why they made sense.

  3. Buy a slot to pitch whatever commercial tool you are backing.

Speakers will get a free ticket to both days of the event. Proposers whose talks are not on the final schedule will be able to purchase tickets at the Early Geek price of Rs. 1800.

Hosted by

The Fifth Elephant - known as one of the best data science and Machine Learning conferences in Asia - has transitioned into a year-round forum for conversations about data and ML engineering; data science in production; data security and privacy practices.

Ramesh Perumalsamy

@rameshp87

Build your own Real Time Analytics and Visualization, Enable Complex Event Processing, Event Patterns and Aggregates

Submitted Jun 22, 2012

This talk is not about using existing tools or frameworks. It is about building a platform for real-time analytics, real-time visualization and real-time complex event processing triggered by events, event patterns and event aggregates.

Outline

Traditionally, people use databases, data warehouses, or even Hadoop and MapReduce to perform batch analytics. However, real-time analytics, real-time visualization and complex event processing require drastically different architecture, storage and paradigms.

In this talk we will share a model for accomplishing the above using:

Elasticsearch: Used to push large data sets into a sharded Lucene index. The index can be held entirely in memory, or partly in memory and partly on disk. This is key to real-time data.

statsd: Aggregator (provides aggregates on the basic events). This is key to continuous ETL and continuous analytics.

Reverse Pump: Pushes aggregate information back into Elasticsearch.

Pattern Recognition: This is similar to looking for regular expressions in Perl. We look for patterns in the event stream and match them. Note that the sliding time window over which matching applies is finite and limited by memory, as is the number of pattern matchers. This system uses a reverse index and RETE techniques.

Graphite: Basic Visualization (includes simple realtime visualization)

Notification: Notifications are triggered when static rules or complex patterns match on base events, aggregate events, or previously matched patterns.
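To make the pattern recognition and notification steps more concrete, here is a minimal sketch of matching an ordered event pattern inside a bounded sliding time window. It is only an illustration under stated assumptions: it uses a plain subsequence scan rather than the reverse index and RETE techniques mentioned above, and the class name `SlidingWindowMatcher`, its parameters and the event type names are hypothetical, not part of the system described in the talk.

```python
from collections import deque


class SlidingWindowMatcher:
    """Fires a callback when an ordered pattern of event types occurs
    within a finite time window (hypothetical illustration)."""

    def __init__(self, pattern, window_seconds, max_events, notify):
        self.pattern = list(pattern)            # ordered event types to look for
        self.window_seconds = window_seconds    # width of the sliding time window
        self.window = deque(maxlen=max_events)  # memory bound: oldest events drop off
        self.notify = notify                    # called with the matched events

    def feed(self, timestamp, event_type):
        """Add one event; return True if the pattern matched and fired."""
        self.window.append((timestamp, event_type))
        # Evict events that have slid out of the time window.
        while self.window and timestamp - self.window[0][0] > self.window_seconds:
            self.window.popleft()
        # Only the last element of the pattern can complete a match.
        if event_type != self.pattern[-1]:
            return False
        matched = self._match()
        if not matched:
            return False
        self.notify(matched)
        # Clear the window so the same events do not trigger repeated alerts.
        self.window.clear()
        return True

    def _match(self):
        """Return the events forming the pattern as an ordered subsequence,
        or an empty list if the pattern is not present in the window."""
        hits = []
        for event in self.window:
            if event[1] == self.pattern[len(hits)]:
                hits.append(event)
                if len(hits) == len(self.pattern):
                    return hits
        return []
```

For example, a matcher built with the pattern `["order_placed", "payment_failed", "payment_failed"]` and a 60 second window would call `notify` once an order event is followed by two payment failures within that window. Both the time window and `max_events` cap how much state is held, which reflects the memory limit noted in the outline.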

Extras:
MySQL binlog parser to observe and analyze events (BinLogParser)
Depict the mood of MySQL with music. Make it sing!
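The statsd aggregation step in the outline can be pictured with a small sketch. statsd accepts metrics as `name:value|type` lines (for example `c` for counters and `ms` for timers) and emits aggregates at each flush interval. The code below imitates that idea in-process; it is not the statsd implementation, the class `FlushAggregator` and the output key suffixes are simplified assumptions, the metric names are made up, and sample rates are ignored.

```python
from collections import defaultdict


def parse_metric(line):
    """Parse a statsd-style 'name:value|type' line into (name, value, type).
    Any trailing sample-rate field (e.g. '|@0.1') is ignored in this sketch."""
    name, rest = line.split(":", 1)
    value, metric_type = rest.split("|")[:2]
    return name, float(value), metric_type


class FlushAggregator:
    """Collects basic events and emits aggregates on flush (hypothetical)."""

    def __init__(self):
        self.counters = defaultdict(float)
        self.timers = defaultdict(list)

    def record(self, line):
        name, value, metric_type = parse_metric(line)
        if metric_type == "c":        # counter: sum the increments
            self.counters[name] += value
        elif metric_type == "ms":     # timer: keep samples until the next flush
            self.timers[name].append(value)
        else:
            raise ValueError("unsupported metric type: " + metric_type)

    def flush(self):
        """Return aggregates for the interval and reset the internal state."""
        aggregates = {}
        for name, total in self.counters.items():
            aggregates[name + ".count"] = total
        for name, samples in self.timers.items():
            aggregates[name + ".mean"] = sum(samples) / len(samples)
            aggregates[name + ".upper"] = max(samples)
        self.counters.clear()
        self.timers.clear()
        return aggregates
```

In a setup like the one outlined, the dictionary returned by `flush()` is the kind of aggregate record a reverse pump could index back into the search store, so that aggregates become queryable alongside the base events.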

Speaker bio

Ramesh Perumalsamy & Vishnu Rao are part of the Supply Chain Technologies (SCT) Platform team at Flipkart. We build SOA systems and event processing that triggers notifications depending on patterns for Flipkart, India’s largest e-commerce entity.

The day job ranges from prototyping to delivering new solutions and technologies that provide productivity multipliers to Flipkart Supply Chain products.

Ramesh loves solving problems, the command line, experimenting with open source solutions, and much more. He also secretly saves the world from super villains (like some of his colleagues).

Vishnu is our in-house MySQL Medical Officer. He runs a Flipkart internal blog called MAS*H (MySQL Army Surgical Hospital), where MySQL queries and MySQL instances alike are treated with love and care :). A Lego freak, he doodles at http://doodle-vishnu.blogspot.com/
