The Fifth Elephant 2015

A conference on data, machine learning, and distributed and parallel computing

Apache Tez - Present and Future

Submitted by Rajesh Balamohan on Monday, 15 June 2015

Technical level

Intermediate

Section

Full Talk

Status

Confirmed & Scheduled

Objective

Discuss the present state and future direction of Apache Tez.

Description

Apache Tez is a framework for building data-flow-driven processing runtimes. It provides scaffolding and library components that can be used to quickly build scalable and efficient data-flow-centric engines. This talk will cover the journey of Tez from a concept in the Apache Incubator to the cornerstone of well-known Hadoop ecosystem projects such as Apache Hive and Apache Pig. I will then move on to the future of Tez: how it is evolving to make it easier to build data processing applications that run in single-digit seconds and/or scale to petabytes of data.

Speaker bio

Rajesh Balamohan has been working on Hadoop for the last couple of years and has recently been concentrating on Tez performance at scale.
