The Fifth Elephant 2012

Finding the elephant in the data.

What are your users doing on your website or in your store? How do you turn the piles of data your organization generates into actionable information? Where do you get complementary data to make yours more comprehensive? What tech, and what techniques?

The Fifth Elephant is a two-day conference on big data.

Early Geek tickets are available from fifthelephant.doattend.com.

The proposal funnel below lets you submit a session and vote on proposed sessions. When proposing a session, it is good practice to introduce yourself and share details about your work as well as the subject of your talk.

Each community member can vote for or against a talk. A vote from each member of the Editorial Panel is equivalent to two community votes. Both types of votes will be considered for final speaker selection.

It’s useful to keep a few guidelines in mind while submitting proposals:

  1. Describe how to use something that is available under a liberal open source license. Participants can use this without having to pay you anything.

  2. Tell a story of how you did something. If it involves commercial tools, please explain why they made sense.

  3. Buy a slot to pitch whatever commercial tool you are backing.

Speakers will get a free ticket to both days of the event. Proposers whose talks are not on the final schedule will be able to purchase tickets at the Early Geek price of Rs. 1800.

Hosted by

The Fifth Elephant - known as one of the best data science and Machine Learning conferences in Asia - has transitioned into a year-round forum for conversations about data and ML engineering, data science in production, and data security and privacy practices.

Milind Bhandarkar

@techmilind

Big Data Analytics with Greenplum Unified Analytics Platform

Submitted May 4, 2012

In this talk, attendees will learn various use cases of Big Data analytics and how to solve them with Greenplum’s Unified Analytics Platform, which combines Greenplum Chorus, a collaboration platform for data science teams; Greenplum Database, a powerful MPP database; and Greenplum HD, a distribution of Apache Hadoop.

Outline

The increase in the volume, variety and velocity of data in the enterprise has given rise to data analytics infrastructure that is markedly different from traditional data warehousing platforms. In addition, the need to derive value from the available data has prompted a new discipline called data science, and has resulted in different workloads for these platforms. In this talk, I will outline Greenplum’s vision of the Unified Analytics Platform (UAP). Greenplum UAP combines Chorus, a collaboration platform for data science and analytics teams; Greenplum Database, a massively parallel database for structured data processing; and Greenplum HD, our distribution of Apache Hadoop. I will outline several use cases that involve both structured and unstructured data analytics, and describe how our customers solve them using Greenplum UAP.

Requirements

Knowledge of SQL and familiarity with the Apache Hadoop ecosystem are assumed.

Speaker bio

Dr. Milind Bhandarkar was the founding member of the team at Yahoo! that took Apache Hadoop from a 20-node prototype to a datacenter-scale production system, and has been contributing to and working with Hadoop since version 0.1.0. He started the Yahoo! Grid solutions team, which focused on training, consulting, and supporting hundreds of new migrants to Hadoop. Parallel programming languages and paradigms have been his area of focus for over 20 years. He has worked at the Center for Development of Advanced Computing (C-DAC), the National Center for Supercomputing Applications (NCSA), the Center for Simulation of Advanced Rockets, Siebel Systems, Pathscale Inc. (acquired by QLogic), Yahoo! and LinkedIn. Currently, he is the Chief Architect at Greenplum Labs, a division of EMC.

