The Fifth Elephant 2015

A conference on data, machine learning, and distributed and parallel computing

Apache Tez - Present and Future

Submitted by Rajesh Balamohan on Monday, 15 June 2015

Section: Full Talk

Technical level: Intermediate

Abstract

This talk covers the present and future of Apache Tez.

Outline

Apache Tez is a framework for building data-flow-driven processing runtimes. It provides scaffolding and library components that can be used to quickly build scalable, efficient data-flow-centric engines. This talk will cover the journey of Tez from a concept in the Apache Incubator to the cornerstone of well-known Hadoop ecosystem projects such as Apache Hive and Apache Pig. I will then turn to the future of Tez: how it is being improved to make it easier to build data processing applications that run in single-digit seconds and/or scale to petabytes of data.

Speaker bio

Rajesh Balamohan has been working on Hadoop for the last couple of years and has recently been concentrating on Tez performance at scale.
