The Fifth Elephant 2014

A conference on big data and analytics

Dr. Hadoop – Diagnose your Hadoop Jobs

Submitted by Chandraprakash Bhagtani (@cpbhagtani) on Friday, 13 June 2014

Section: Crisp talk
Technical level: Intermediate



Have you ever run a job or query on Hadoop that was very slow, with no clue why? You look at the job details on the JobTracker and get lost among hundreds of counters and configuration settings, unsure how to make sense of them. This is a very common challenge for Hadoop beginners, especially analysts and people coming from the RDBMS world. This talk is about the solution we have built to address this problem.


This talk is about a tool we have developed at Intuit, Dr. Hadoop, which analyzes your job, identifies areas for improvement, and recommends changes to improve its performance. It collects the history logs, counters, and configuration of your job, applies a set of rules, and produces recommendations, each with a suggested value and a severity.
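Since Dr. Hadoop itself is internal, the following is only a minimal sketch of what a rule-based diagnosis of this kind might look like, not the tool's actual implementation. The counter names (MAP_OUTPUT_RECORDS, SPILLED_RECORDS, REDUCE_INPUT_RECORDS) and properties (io.sort.mb, mapred.reduce.tasks) are standard MRv1 names; the rules, thresholds, suggested values, and every class and function name are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Recommendation:
    rule: str
    severity: str          # "low", "medium" or "high"
    setting: str           # configuration property to change
    suggested_value: str
    reason: str


# A rule inspects a job's counters and configuration and returns a
# recommendation, or None when it finds nothing to improve.
Rule = Callable[[Dict[str, int], Dict[str, str]], Optional[Recommendation]]


def excessive_spill_rule(counters, config):
    """Hypothetical rule: map-side records spilled to disk more than once."""
    map_out = counters.get("MAP_OUTPUT_RECORDS", 0)
    spilled = counters.get("SPILLED_RECORDS", 0)  # assumed map-side only
    if map_out == 0 or spilled <= map_out:
        return None
    ratio = spilled / map_out
    current_mb = int(config.get("io.sort.mb", "100"))
    return Recommendation(
        rule="excessive_spill",
        severity="high" if ratio >= 2.0 else "medium",  # assumed threshold
        setting="io.sort.mb",
        suggested_value=str(current_mb * 2),            # assumed heuristic
        reason=f"spilled/map-output record ratio is {ratio:.2f}",
    )


def single_reducer_rule(counters, config):
    """Hypothetical rule: a large reduce input handled by one reducer."""
    reducers = int(config.get("mapred.reduce.tasks", "1"))
    reduce_in = counters.get("REDUCE_INPUT_RECORDS", 0)
    if reducers != 1 or reduce_in < 10_000_000:         # assumed threshold
        return None
    return Recommendation(
        rule="single_reducer",
        severity="medium",
        setting="mapred.reduce.tasks",
        suggested_value="4",                            # placeholder value
        reason=f"{reduce_in} reduce input records went to a single reducer",
    )


DEFAULT_RULES: List[Rule] = [excessive_spill_rule, single_reducer_rule]
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def diagnose(counters, config, rules=DEFAULT_RULES):
    """Apply every rule and return the findings, most severe first."""
    findings = [rec for rule in rules
                if (rec := rule(counters, config)) is not None]
    return sorted(findings, key=lambda rec: SEVERITY_ORDER[rec.severity])


if __name__ == "__main__":
    job_counters = {"MAP_OUTPUT_RECORDS": 1000, "SPILLED_RECORDS": 3000}
    job_config = {"io.sort.mb": "100"}
    for rec in diagnose(job_counters, job_config):
        print(f"[{rec.severity}] {rec.setting} -> {rec.suggested_value}: "
              f"{rec.reason}")
```

Keeping each rule as a plain function over counters and configuration means new checks can be added without touching the driver, which is one plausible way to grow "a set of rules" over time.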

Speaker bio

I am a Hadoop performance engineer at Intuit. I have been working on Hadoop performance for more than 3 years.


  • Kiran Jonnalagadda (@jace) Reviewer 5 years ago

    Is this tool available to the public?

  • Chandraprakash Bhagtani (@cpbhagtani) Proposer 5 years ago

    No, this is internal to Intuit and still in beta.

  • Mayank Jaiswal (@msjaiswal) 5 years ago

    Does Intuit have plans to open-source this tool or make it public in the near future? Also, will we be able to use it with AWS Elastic MapReduce infrastructure?

  • Chandraprakash Bhagtani (@cpbhagtani) Proposer 5 years ago

    This tool is still in beta and under development. We have no plans to make it public as of now. Yes, we should be able to use it with EMR.
