Lessons Learned: Scaling Hadoop and BigData in Cloud (Amazon EMR)

Fri, Jul 27, 2012, 09:30 AM – 05:30 PM IST
Sat, Jul 28, 2012, 09:30 AM – 05:00 PM IST

Nimhans Convention Centre, Bangalore

Submitted Mar 20, 2012

Section: Big Data Infrastructure & Processing
Technical level: Intermediate
Session type: Lecture

A high-level technology and business perspective on BigData, including how and why to leverage cloud-based platforms like Amazon EMR, along with MapReduce, for data analysis.

Expect to learn concepts, insights, and challenges in problem solving, scaling, dealing with data, and performance tuning.

Outline

I will be talking about Hadoop and MapReduce in general, and about how to leverage cloud-based platforms like Amazon EMR for Hadoop MapReduce jobs. I will also share lessons learned from managing hyper-scale production Hadoop clusters and tuning them for performance. Think 68,400 GB of RAM, 26,000 CPUs and 1,700,000 GB of disk :)

This talk aims to share the insights and challenges that come with building large-scale data analysis platforms on Hadoop, and the technical challenges of scaling everything from algorithms to data storage, JSON parsers and in-memory data stores, through to managing 100s of jobs.

Requirements

A basic understanding of Hadoop will help.

Speaker bio

An Engineer (aka CTO) working at Kuliza on Platforms, Cloud and BigData. Previously at #Startups #GizaPage #Trilogy #eFoodlet #Michelin.

I have worked on large-scale web platforms and distributed systems for enterprises and the consumer web. At Kuliza, we work on petabyte-scale EMR clusters with 100s of nodes running Hadoop for data analysis. Our cloud team runs 500+ production servers with every possible #technical #stack #configuration out there!