Building Spark as Service in Cloud using YARN

Jul 2015

16 Thu 08:30 AM – 06:35 PM IST

17 Fri 08:30 AM – 06:30 PM IST

18 Sat 09:00 AM – 06:30 PM IST

NIMHANS Convention center

Submitted May 25, 2015

Section: Full Talk
Technical level: Intermediate

Apache Spark is rapidly gaining popularity as a data processing framework. However, installing and running it can be daunting. In this talk we will cover the challenges of running Spark in the cloud on YARN and how we built Spark as a Service. We will also discuss what we learned from building and operating this service in the AWS cloud, and where we plan to take it next.

Outline

We will talk about:

Self-managed Spark clusters in the cloud
Using spot nodes in the AWS cloud
Autoscaling Spark applications
Running Spark SQL queries against an existing Hive metastore
End-user APIs and user interface for the Spark as a Service offering

Requirements

Basic knowledge of Spark, MapReduce, and cloud computing.

Speaker bio

Bharath Bhushan is a software engineer at Qubole, where he currently works on the Spark offering. He previously worked at Google (Page Speed team) and Citrix.

Rajat Gupta is a software engineer at Qubole, where he currently works on the Spark offering. He previously worked at Calypto and Cypress Semiconductors.

Hosted by

The Fifth Elephant

The Fifth Elephant 2015
