The Fifth Elephant 2015

A conference on data, machine learning, and distributed and parallel computing

Rajesh Balamohan

Apache Tez - Present and Future

Submitted Jun 15, 2015

Talk about the present and future of Apache Tez.


Apache Tez is a framework for building data-flow-driven processing runtimes. It provides scaffolding and library components that can be used to quickly build scalable and efficient data-flow-centric engines. This talk will cover the journey of Tez from a concept in the Apache Incubator to the cornerstone of well-known Hadoop ecosystem projects such as Apache Hive and Apache Pig. I will then turn to the future of Tez: how it is being improved to make it easier to build data processing applications that run in single-digit seconds and/or scale to petabytes of data.

Speaker bio

Rajesh Balamohan has been working on Hadoop for the last couple of years and has recently been concentrating on Tez performance at scale.


