Big data analysis with Apache Spark

Jul 2015

16 Thu 08:30 AM – 06:35 PM IST

17 Fri 08:30 AM – 06:30 PM IST

18 Sat 09:00 AM – 06:30 PM IST

NIMHANS Convention center

Submitted May 6, 2015

Section: Workshop
Technical level: Beginner

Apache Spark is an up-and-coming big data processing engine. It is gaining popularity for its ease of use and for unifying different big data workloads in a single engine. The objective of this workshop is to get your hands dirty with it.

Outline

We will go over the following in the workshop:

Evolution of Big data systems
Why Apache Spark?
Apache Spark architecture
Installing Spark
Working with Spark REPL
Working with RDDs
RDD Map/Reduce
RDD examples in Scala
Using Spark for streaming data
Spark SQL

Requirements

General programming knowledge
General knowledge of big data systems like Hadoop

You need a laptop with Spark installed. I will share specific installation steps closer to the workshop.

Speaker bio

Madhukara Phatak is a big data consultant at Datamantra. He has been actively working on Hadoop, Spark, and their ecosystem projects for the last 5 years.

He was the lead developer of Nectar, an ML library for Hadoop. He also contributed to the Hadoop source code to improve cyclic checks in the JobControl API. With the rise of Apache Spark, he and his team have open sourced examples from the Coursera machine learning course implemented on Spark. He blogs about Spark and also runs a Spark meetup group in Bangalore.

The Fifth Elephant 2015