Hands-on introduction to Pig
Submitted by Prashanth Babu (@p7h) on Sunday, 10 June 2012
Big Data Infrastructure & Processing
- Pig is a high-level platform for creating MapReduce programs used with Hadoop for analyzing Big Data.
- This is a 2-hour workshop introducing Pig.
- The workshop is run as an introductory live-coding session on analyzing Big Data with Pig.
This workshop will include discussion on:
- Basics of Hadoop
- Basics of Pig and PigLatin
- Pig vs MapReduce
- Pig vs SQL
- Live-coding session on analyzing a large sample data set with Pig.
- Visualizing Pig MapReduce jobs with Twitter Ambrose.
Requirements for participants:
- Basic understanding of Hadoop, HDFS and MapReduce.
- Laptop with VMware Player or Oracle VirtualBox installed.
- Please download the Cloudera Demo VM from https://ccp.cloudera.com/display/SUPPORT/Cloudera's+Hadoop+Demo+VM
- Alternatively, a USB flash drive will be distributed with a VMware image of 64-bit Ubuntu Server 12.04 (Precise Pangolin), with Hadoop, HBase, Sqoop, Hive and Pig installed and configured using Apache Bigtop.
Prashanth Babu has 9+ years of experience in software development, predominantly in Java and JavaEE. He works at NTT DATA Global Delivery Services (previously Keane India Pvt. Ltd.) on an R&D initiative on Big Data using the Apache Hadoop ecosystem. He is also an avid Android enthusiast with experience in Android app development.