The Fifth Elephant 2012

Finding the elephant in the data.

What are your users doing on your website or in your store? How do you turn the piles of data your organization generates into actionable information? Where do you get complementary data to make yours more comprehensive? What tech, and what techniques?

The Fifth Elephant is a two-day conference on big data.

Early Geek tickets are available from fifthelephant.doattend.com.

The proposal funnel below will enable you to submit a session and vote on proposed sessions. It is good practice to introduce yourself and share details about your work, as well as the subject of your talk, when proposing a session.

Each community member can vote for or against a talk. A vote from each member of the Editorial Panel is equivalent to two community votes. Both types of votes will be considered for final speaker selection.

It’s useful to keep a few guidelines in mind while submitting proposals:

  1. Describe how to use something that is available under a liberal open source license. Participants can use this without having to pay you anything.

  2. Tell a story of how you did something. If it involves commercial tools, please explain why they made sense.

  3. If you want to pitch a commercial tool you are backing, buy a sponsored slot.

Speakers will get a free ticket to both days of the event. Proposers whose talks are not on the final schedule will be able to purchase tickets at the Early Geek price of Rs. 1800.

Hosted by

The Fifth Elephant, known as one of the best data science and machine learning conferences in Asia, has transitioned into a year-round forum for conversations about data and ML engineering, data science in production, and data security and privacy practices.

Kushwaha Manish Kaushal

@manish

Handle BigData Analytics with Hadoop eco-system (Hadoop, HBase, Hive, WorkFlow)

Submitted Jun 29, 2012

This talk gives insight into the problems and solutions involved in working with very high volumes of data (~330 TB). The problems span hardware infrastructure as well as functional processing, and they grow further when your data collection increases by 10% per month. We solve them using the Hadoop ecosystem.

Outline

We at PubMatic handle more than 330 TB of data using the Apache Hadoop ecosystem. We have resolved many of Hadoop's burning issues using available open source tools. By combining several Hadoop components, we have built our "on the fly analytics" platform, which addresses a wide range of analytics needs.

I would like to cover how we efficiently tackle huge data sets at our company with no software cost, using commodity servers; the day-to-day problems of operating a large Hadoop cluster and generic solutions to them; and a few analytics use cases that require heavy data churning and joins between different data sets.
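A common Hadoop pattern for joining different data sets, as mentioned above, is the reduce-side join: mappers tag each record with its source, the shuffle groups records by key, and the reducer combines the tagged records. As a minimal sketch in plain Python (illustrative only — the data sets and field names are hypothetical, not PubMatic's actual platform), the idea looks like this:

```python
from itertools import groupby
from operator import itemgetter

# Two hypothetical data sets keyed by ad_id: impression logs and ad metadata.
impressions = [("ad1", "impression", "2012-06-01"),
               ("ad2", "impression", "2012-06-01"),
               ("ad1", "impression", "2012-06-02")]
ads = [("ad1", "ad_meta", "banner"),
       ("ad2", "ad_meta", "video")]

def mapper(records):
    # Emit (key, tagged_value) so the reducer can tell the sources apart.
    for rec in records:
        yield (rec[0], rec[1:])

def reduce_side_join(*sources):
    # Emulate the shuffle phase: merge all mapper output and sort by key.
    shuffled = sorted(r for src in sources for r in mapper(src))
    joined = []
    # Each group holds every record sharing a key, across both sources.
    for key, group in groupby(shuffled, key=itemgetter(0)):
        values = [v for _, v in group]
        meta = [v for v in values if v[0] == "ad_meta"]
        hits = [v for v in values if v[0] == "impression"]
        # Cross-product of metadata and impressions with the same key.
        for m in meta:
            for h in hits:
                joined.append((key, m[1], h[1]))
    return joined

print(reduce_side_join(impressions, ads))
```

In a real cluster the sort-and-group step is done by Hadoop's shuffle, and the mapper and reducer run in parallel across machines; tools such as Hive express the same join declaratively in SQL.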

Speaker bio

Manish Kaushal, Principal Architect, Pubmatic.
Handles the analytics initiative and the Hadoop ecosystem.
Past:
Sr. R&D Engineer at Nokia Siemens Networks, where he handled ad-server projects requiring large-scale data handling.
Sr. Lead Engineer at Motorola, where he handled various telecom and retail loyalty programs, all of which required handling big data.

