RHadoop: Marrying analytics & large scale data processing
- Why Hadoop is not analytics
- Why “Big Data” is not analytics
- Quick overview of performing analytics over a large dataset (without RHadoop)
- Easing the pain with RHadoop: RHadoop tutorial session
This session will move quickly from helping the audience see the difference between “big data” & analytics to a hands-on segment on using R to perform analytics on a large dataset (using naive distributed computing), followed by a hands-on session on using RHadoop.
- Familiarity with R would help, though it is not mandatory
- Some familiarity with statistics would help in following the jargon
Anand Krishnaswamy is a developer at ThoughtWorks. His background spans filesystem development, storage management solution development, class library & compiler design & development and, recently, web-app development. His interest in data analytics is self-developed. His other interests range from photography & cooking to writing & painting.