The Fifth Elephant 2012

Finding the elephant in the data.

What are your users doing on your website or in your store? How do you turn the piles of data your organization generates into actionable information? Where do you get complementary data to make yours more comprehensive? What tech, and what techniques?

The Fifth Elephant is a two-day conference on big data.

Early Geek tickets are available from fifthelephant.doattend.com.

The proposal funnel below will enable you to submit a session and vote on proposed sessions. It is good practice to introduce yourself and share details about your work, as well as the subject of your talk, while proposing a session.

Each community member can vote for or against a talk. A vote from each member of the Editorial Panel is equivalent to two community votes. Both types of votes will be considered for final speaker selection.

It’s useful to keep a few guidelines in mind while submitting proposals:

  1. Describe how to use something that is available under a liberal open source license. Participants can use this without having to pay you anything.

  2. Tell a story of how you did something. If it involves commercial tools, please explain why they made sense.

  3. Buy a slot to pitch whatever commercial tool you are backing.

Speakers will get a free ticket to both days of the event. Proposers whose talks are not on the final schedule will be able to purchase tickets at the Early Geek price of Rs. 1800.

Hosted by

The Fifth Elephant - known as one of the best data science and Machine Learning conferences in Asia - has transitioned into a year-round forum for conversations about data and ML engineering; data science in production; data security and privacy practices.

Rahul Kulkarni

Crunching big data, Google scale

Submitted Jul 9, 2012

Insights from Google’s experience in handling big data, and an overview of techniques and products (with case studies) for crunching big data.

Outline

Whether you are an e-commerce startup or a genomics research lab, or plan to run products at the scale of Gmail, YouTube or AdWords, you generate gigabytes of data every day, if not every hour. Your storage requirements run into terabytes or petabytes, and you may need thousands if not hundreds of thousands of CPU cores to process that data. At Google, we have developed several in-house tools and techniques to process data at scale. Recently we made several of these tools available externally. In this talk we will go over some of our learnings on big data, and discuss techniques, with case studies, for crunching big data.

Speaker bio

Rahul Kulkarni joined Google in 2006 as its first product manager in India. He currently manages a cross-functional team leading Google Maps and local efforts in India. He has built and led teams around cloud computing, the Google Apps platform, Orkut, OpenSocial, Google Finance and ads at Google. Prior to joining Google, Rahul led new product development efforts at National Instruments Corp, Austin, in the areas of design, prototyping and deployment of high-speed control systems.

