BDAS, the Berkeley Data Analytics Stack

Jul 2014

21 Mon

22 Tue

23 Wed 09:30 AM – 05:00 PM IST

24 Thu 09:45 AM – 05:00 PM IST

25 Fri 08:30 AM – 07:15 PM IST

26 Sat 08:30 AM – 07:15 PM IST

27 Sun

NIMHANS Convention Centre, Bangalore

Submitted Apr 15, 2014

Section: Crisp talk

Technical level: Beginner

This talk introduces the features of BDAS, the next-generation open source data analytics stack developed by UC Berkeley's AMPLab.

Outline

BDAS is made up of multiple components and is compatible with the Hadoop stack:

Spark, a high-speed cluster computing system that can perform computations in memory
Mesos, a cluster manager that provides efficient resource isolation and sharing across distributed applications
Tachyon, a fault-tolerant distributed file system that enables reliable file sharing at memory speed across cluster frameworks
MLBase, a platform for implementing and consuming Machine Learning techniques at scale
Shark, a port of Apache Hive onto Spark that is compatible with existing Hive warehouses and queries
Spark Streaming, an extension of Spark for building scalable, fault-tolerant streaming applications
GraphX, an extension of Spark for working with graph-structured data
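
As a rough illustration of what "computations in memory" means for a system like Spark, here is a toy Python sketch. The class ToyDataset and its methods map, cache and collect are hypothetical names invented for this example; they are not Spark's API and run on a single machine. The sketch only mimics the general idea: transformations are described lazily, and a dataset marked for caching is computed once and then served from memory on later uses.

```python
class ToyDataset:
    """Toy stand-in for a cluster dataset (hypothetical, not Spark's API)."""

    def __init__(self, compute):
        self._compute = compute        # recipe that (re)builds the data
        self._keep_in_memory = False   # set by cache()
        self._cached = None            # materialized results, if kept
        self.compute_calls = 0         # how many times the recipe ran

    def map(self, fn):
        # Lazy: nothing runs here; we only record how to derive new data.
        parent = self
        return ToyDataset(lambda: [fn(x) for x in parent.collect()])

    def cache(self):
        # Request that results stay in memory after the first computation.
        self._keep_in_memory = True
        return self

    def collect(self):
        # Serve from memory when possible, otherwise run the recipe.
        if self._cached is not None:
            return self._cached
        self.compute_calls += 1
        result = list(self._compute())
        if self._keep_in_memory:
            self._cached = result
        return result


numbers = ToyDataset(lambda: range(1, 6))
squares = numbers.map(lambda x: x * x).cache()

total = sum(squares.collect())    # first use: computes and caches
largest = max(squares.collect())  # second use: answered from memory
```

Without the call to cache(), the second collect() would recompute the squares from the source data; with it, the recipe runs only once. That saving on repeated passes over the same data is the intuition behind in-memory cluster computing.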

Requirements

Participants should have a basic understanding of Big Data concepts and Hadoop.

Speaker bio

Has worked on software for more than 15 years, with a focus on improving the performance and optimization of applications and algorithms. Interests include Big Data, parallelism, and algorithm optimization.

Hosted by

The Fifth Elephant

The Fifth Elephant 2014
