Workflow Schedulers: The Heartbeat of a Big Data Stack
Submitted by Rajat Venkatesh (@vrajat) on Friday, 26 April 2013
Storage and Databases
Drawing on use cases of how Qubole customers use the Scheduler product, I'll talk about:
Salient features required of a Big Data scheduler.
A comparison of open source schedulers such as Apache Oozie, Azkaban (LinkedIn), Luigi (Spotify) and Chronos (Airbnb).
Experiences with using Apache Oozie.
At Qubole, we use Apache Oozie as the scheduler. Based on how the Scheduler product is used in the Qubole platform, I'll address which features matter more than others. With those salient features in mind, I'll compare Oozie with other open source schedulers such as Azkaban (LinkedIn), Luigi (Spotify) and Chronos (Airbnb). This should help attendees make a more informed decision about which of these technologies to choose for scheduling ETL and reporting processes on top of Hadoop.
Rajat Venkatesh is a developer at Qubole, a company that provides data analysis tools in the cloud. He is responsible for the Scheduler product at Qubole. Before Qubole, he worked as a database kernel developer at Vertica, a big data analytics platform.