Apache Tez: Accelerating Hadoop Data Pipelines

Jul 2014

21 Mon

22 Tue

23 Wed 09:30 AM – 05:00 PM IST

24 Thu 09:45 AM – 05:00 PM IST

25 Fri 08:30 AM – 07:15 PM IST

26 Sat 08:30 AM – 07:15 PM IST

27 Sun

NIMHANS Convention Centre, Bangalore

This submission has been added to the schedule

Submitted May 23, 2014

Section: Full talk

Technical level: Beginner

Apache Tez is a DAG execution engine that is a superset of traditional MapReduce. Tez is designed as a replacement computational model for nearly everything that currently uses MapReduce.

This talk introduces Tez, its architecture, and its evolution from traditional MapReduce.

Outline

Apache Tez is a modern data processing engine designed for YARN on Hadoop 2. Tez aims to provide high performance and efficiency out of the box, across the spectrum from low-latency queries to heavyweight batch processing. With a clear separation between the logical application layer and the physical data movement layer, Tez is designed from the ground up to be a platform on which a variety of domain-specific applications can be built. Tez has pluggable control and data planes that let users plug in custom data transfer technologies, concurrency control, and scheduling policies to meet their exact requirements. The talk will elaborate on these features through real use cases from early adopters such as Hive, Pig and Cascading.
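To make the DAG idea concrete, here is a minimal sketch in plain Python. It is not the Tez API: `ToyDag`, its methods, and the three vertex functions are hypothetical names invented for illustration. It shows a word count followed by a top-2 selection expressed as one graph of three vertices; with classic MapReduce the same pipeline would typically be two chained jobs, with the intermediate counts written to HDFS in between.

```python
from collections import defaultdict, deque


class ToyDag:
    """Hypothetical toy DAG runner for illustration; not the Tez API."""

    def __init__(self):
        self.vertices = {}                # vertex name -> processing function
        self.inputs = defaultdict(list)   # vertex name -> upstream vertex names

    def add_vertex(self, name, func):
        self.vertices[name] = func
        return self

    def add_edge(self, src, dst):
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both vertices must be added before the edge")
        self.inputs[dst].append(src)
        return self

    def run(self, source_data):
        # Kahn's algorithm: run a vertex once all of its inputs are ready.
        indegree = {v: len(self.inputs[v]) for v in self.vertices}
        downstream = defaultdict(list)
        for dst, srcs in self.inputs.items():
            for src in srcs:
                downstream[src].append(dst)
        ready = deque(v for v, d in indegree.items() if d == 0)
        results = {}
        while ready:
            name = ready.popleft()
            upstream = [results[s] for s in self.inputs[name]]
            # Root vertices read the source data; others read upstream output.
            args = upstream if upstream else [source_data]
            results[name] = self.vertices[name](*args)
            for nxt in downstream[name]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)
        if len(results) != len(self.vertices):
            raise ValueError("graph contains a cycle")
        return results


def tokenize(lines):
    return [w.lower() for line in lines for w in line.split()]


def count(words):
    counts = defaultdict(int)
    for w in words:
        counts[w] += 1
    return dict(counts)


def top2(counts):
    # Highest count first; ties broken alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:2]


dag = (
    ToyDag()
    .add_vertex("tokenize", tokenize)
    .add_vertex("count", count)
    .add_vertex("top2", top2)
    .add_edge("tokenize", "count")
    .add_edge("count", "top2")
)

results = dag.run(["tez runs dags", "tez runs on yarn"])
print(results["top2"])
```

The graph here only describes which step feeds which; how data moves between steps is decided inside `run`. That loosely mirrors the separation between the logical layer and the data movement layer described above, although a real engine distributes the work across a cluster rather than calling functions in one process.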

Speaker bio

Gopal works on performance problems in the Hadoop ecosystem. He’s involved with the Stinger effort from Hortonworks to improve the SQL data access layers in Hadoop. He is a contributor to the Apache Hive project and a committer on the Apache Tez project.

Hosted by

The Fifth Elephant

The Fifth Elephant 2014
