Jul 2016
25 Mon
26 Tue
27 Wed
28 Thu 08:30 AM – 06:25 PM IST
29 Fri 08:30 AM – 06:15 PM IST
30 Sat 08:45 AM – 05:00 PM IST
31 Sun 08:15 AM – 06:00 PM IST
Akshay Rai
Hadoop is a framework for the distributed storage and processing of large datasets, built from many components that interact with each other. Because the framework is large and complex, it is important that every component performs optimally. While we can always optimize the underlying hardware resources, network infrastructure, OS, and other layers of the stack, only users have control over optimizing the jobs that run on the cluster.
Dr. Elephant is a tool that helps Hadoop users understand, analyze, and tune their Hadoop and Spark applications, improving both their productivity and the cluster’s efficiency. It analyzes Hadoop and Spark jobs using a set of pluggable, configurable, rule-based heuristics that provide insights into how a job performed, and then uses the results to suggest how to tune the job so that it runs more efficiently.
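To make the idea of pluggable, configurable, rule-based heuristics concrete, here is a minimal sketch in Python. It is not Dr. Elephant's actual API. The severity names, metric keys (`max_task_input_mb`, `avg_task_input_mb`, `gc_ms`, `cpu_ms`), thresholds, and function names are all hypothetical and chosen only for illustration. Each heuristic is a rule that maps job metrics to a severity and a tuning suggestion, and the analyzer simply runs whichever heuristics are plugged in.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical severity scale, lowest to highest (an assumption for this sketch).
SEVERITIES = ["NONE", "LOW", "MODERATE", "SEVERE", "CRITICAL"]


@dataclass
class HeuristicResult:
    name: str
    severity: str
    suggestion: str


# A heuristic is any callable from job metrics to a result, so new rules
# can be plugged in without changing the analyzer.
Heuristic = Callable[[Dict[str, float]], HeuristicResult]


def severity_for(value: float, thresholds: Sequence[float]) -> str:
    """Map a value to a severity using four ascending thresholds."""
    level = 0
    for i, threshold in enumerate(thresholds, start=1):
        if value >= threshold:
            level = i
    return SEVERITIES[level]


def make_skew_heuristic(thresholds: Sequence[float]) -> Heuristic:
    """Rule: flag jobs where one task reads far more input than the average."""
    def heuristic(metrics: Dict[str, float]) -> HeuristicResult:
        avg = max(metrics["avg_task_input_mb"], 1e-9)  # avoid division by zero
        ratio = metrics["max_task_input_mb"] / avg
        severity = severity_for(ratio, thresholds)
        suggestion = "" if severity == "NONE" else "Repartition input to balance tasks."
        return HeuristicResult("Task input skew", severity, suggestion)
    return heuristic


def make_gc_heuristic(thresholds: Sequence[float]) -> Heuristic:
    """Rule: flag jobs that spend a large share of CPU time in garbage collection."""
    def heuristic(metrics: Dict[str, float]) -> HeuristicResult:
        ratio = metrics["gc_ms"] / max(metrics["cpu_ms"], 1e-9)
        severity = severity_for(ratio, thresholds)
        suggestion = "" if severity == "NONE" else "Review task memory settings."
        return HeuristicResult("GC overhead", severity, suggestion)
    return heuristic


def analyze(metrics: Dict[str, float], heuristics: List[Heuristic]) -> List[HeuristicResult]:
    """Run every plugged-in heuristic against one job's metrics."""
    return [h(metrics) for h in heuristics]


# Configuration: thresholds are supplied when the heuristics are registered.
heuristics = [
    make_skew_heuristic([2, 4, 8, 16]),
    make_gc_heuristic([0.05, 0.1, 0.2, 0.4]),
]

job_metrics = {
    "max_task_input_mb": 800.0,
    "avg_task_input_mb": 100.0,
    "gc_ms": 1_000.0,
    "cpu_ms": 100_000.0,
}

for result in analyze(job_metrics, heuristics):
    print(result.name, result.severity, result.suggestion)
```

In this sketch, adding a new rule means writing one more function and appending it to the `heuristics` list, and tuning a rule means changing the thresholds passed at registration, which is the general shape the words "pluggable" and "configurable" suggest.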
Phase 1: I’ll share our experience at LinkedIn in optimizing user jobs, the challenges we faced, and how a simple self-serve tool like Dr. Elephant helped overcome them.
Phase 2: I’ll share how we integrated the tool into our developer lifecycle and encouraged developers to optimize their jobs with minimal support from Hadoop experts.
Phase 3: I’ll discuss the tool itself: how it analyzes a job by gathering diverse information about it, how to write custom heuristics and plug them into Dr. Elephant, how to compare and analyze job executions, and more.
Akshay Rai is an engineer on the Hadoop development team at LinkedIn. He has been working on Dr. Elephant for more than a year and has worked extensively to help open source the tool. Since the open source announcement last week, he has been actively engaging in discussions with the community and leading the project.