The Fifth Elephant 2015

A conference on data, machine learning, and distributed and parallel computing

Apache Tez - Present and Future

Rajesh Balamohan

Talk about the present and future of Apache Tez.

Outline

Apache Tez is a framework for building data-flow-driven processing runtimes. It provides scaffolding and library components that can be used to quickly build scalable and efficient data-flow-centric engines. This talk will cover the journey of Tez from a concept in the Apache Incubator to a cornerstone of well-known Hadoop ecosystem projects such as Apache Hive and Apache Pig. I will then move on to the future of Tez: how it is improving to make it easier to build data processing applications that run in single-digit seconds and/or scale to petabytes of data.

Speaker bio

Rajesh Balamohan has been working on Hadoop for the last couple of years and has recently been concentrating on Tez performance at scale.
