The Fifth Elephant 2014

A conference on big data and analytics

Dr. Hadoop – Diagnose your Hadoop Jobs

Submitted by Chandraprakash Bhagtani (@cpbhagtani) on Jun 13, 2014

Section: Crisp talk
Technical level: Intermediate
Status: Confirmed & scheduled

Abstract

Have you ever run a job or query on Hadoop that ran very slowly, and had no clue why? You look at the job details on the JobTracker and get lost among hundreds of counters and configuration settings, with no idea how to make sense of them. This is a very common challenge for Hadoop beginners, especially analysts and people coming from the RDBMS world. This talk is about the solution we have built to address this problem.

Outline

This talk is about a tool we have developed at Intuit, Dr. Hadoop, which analyzes your job, identifies areas for improvement, and recommends changes to improve its performance. It collects the history logs, counters, and configuration of your job, applies a set of rules, and reports recommendations, each with a suggested value and a severity.
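To make the collect, apply rules, recommend flow concrete, here is a minimal sketch in Python. It is not Dr. Hadoop's actual code: the class names (`JobProfile`, `Recommendation`), the `diagnose` function, the single spill rule, and its thresholds are all hypothetical and chosen only for illustration. The counter names `SPILLED_RECORDS` and `MAP_OUTPUT_RECORDS` and the Hadoop 1 setting `io.sort.mb` are real, but how the tool actually uses them is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class JobProfile:
    """Counters and configuration collected for one job (hypothetical shape)."""
    counters: Dict[str, int]
    config: Dict[str, str]


@dataclass
class Recommendation:
    rule: str
    severity: str         # "medium" or "high" in this sketch
    setting: str          # configuration key the rule suggests changing
    suggested_value: str
    reason: str


# A rule inspects a job profile and returns a recommendation, or None.
Rule = Callable[[JobProfile], Optional[Recommendation]]


def excessive_spill_rule(job: JobProfile) -> Optional[Recommendation]:
    """Illustrative rule: flag jobs that spill more records than maps emit.

    The 1.0 and 2.0 thresholds and the "double the buffer" advice are
    assumptions for this sketch, not values taken from the talk.
    """
    spilled = job.counters.get("SPILLED_RECORDS", 0)
    map_out = job.counters.get("MAP_OUTPUT_RECORDS", 0)
    if map_out == 0:
        return None
    ratio = spilled / map_out
    if ratio <= 1.0:
        return None
    current_mb = int(job.config.get("io.sort.mb", "100"))
    return Recommendation(
        rule="excessive_spill",
        severity="high" if ratio > 2.0 else "medium",
        setting="io.sort.mb",
        suggested_value=str(current_mb * 2),
        reason=f"spilled/map-output record ratio is {ratio:.2f}",
    )


DEFAULT_RULES: List[Rule] = [excessive_spill_rule]


def diagnose(job: JobProfile, rules: Optional[List[Rule]] = None) -> List[Recommendation]:
    """Apply every rule to the job and keep the recommendations that fire."""
    active = DEFAULT_RULES if rules is None else rules
    results = (rule(job) for rule in active)
    return [rec for rec in results if rec is not None]


if __name__ == "__main__":
    job = JobProfile(
        counters={"SPILLED_RECORDS": 3000, "MAP_OUTPUT_RECORDS": 1000},
        config={"io.sort.mb": "100"},
    )
    for rec in diagnose(job):
        print(f"[{rec.severity}] set {rec.setting}={rec.suggested_value} ({rec.reason})")
```

Keeping each rule as a plain function that returns either a recommendation or nothing means new checks can be added to the list without touching the engine.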

Speaker bio

I am a Hadoop performance engineer at Intuit and have been working on Hadoop performance for more than three years.
