Big Data Analytics with R

Jul 2013

11 Thu 09:30 AM – 04:30 PM IST

12 Fri 10:15 AM – 05:30 PM IST

13 Sat 10:15 AM – 05:30 PM IST

Nimhans Convention Centre

Submitted Apr 23, 2013

Section: Analytics and Visualization

Technical level: Intermediate

Attendees will come away with an understanding of the High Performance and Parallel Computing landscape in R. This area of R is changing rapidly, and the objective of this session is to provide insight into the various active contributions to it. The session will also delve into analyzing moderately large data sets, which presents a huge opportunity today: it offers a way around the “everything in memory” challenge in R without huge infrastructure or software setup, or the associated costs.

Outline

When we hear about parallelism and big data processing in R, we tend to think of grid computing, or of parallel computing with Hadoop or Revolution Analytics, which requires infrastructure setup and typically skills and programming beyond R. These may be required for analyzing really big data sets (terabytes and up). However, for handling data of up to a few hundred GB, R packages such as ff and bigmemory can solve a large number of use cases without additional memory or hardware setup. These techniques, though useful, are not very well known, and they are the primary focus of this session.

Speaker bio

Neeta Pande, Data Architect, Intuit: Neeta has about 13 years of experience in Business Intelligence and Analytics. She has extensive experience architecting and engineering data analytics in the BFSI, manufacturing and personal finance domains. Her recent focus areas include usage behavior analysis, real-time customer behavior prediction and contextual personalization service platforms, and designing scalable and sustainable technology platforms for solving big data problems.

The Fifth Elephant 2013