The Fifth Elephant 2012

Finding the elephant in the data.

Prashanth Babu

@p7h

Hands-on introduction to Pig

Submitted Jun 10, 2012

  • Pig is a high-level platform for creating MapReduce programs used with Hadoop for analyzing Big Data.
  • This is a 2-hour workshop introducing Pig.
  • It is run as a live-coding session on analyzing Big Data with Pig.

Outline

This workshop will include discussion on:

  • Basics of Hadoop
  • Basics of Pig and Pig Latin
  • Pig vs MapReduce
  • Pig vs SQL

And also:

  • Live-coding session on Pig, analyzing a large sample dataset.
  • Visualizing Pig MapReduce jobs with Twitter Ambrose.

Requirements

  • Basic understanding of Hadoop, HDFS and MapReduce.
  • Laptop with VMware Player or Oracle VirtualBox installed.
  • Please download the Cloudera Demo VM from https://ccp.cloudera.com/display/SUPPORT/Cloudera’s+Hadoop+Demo+VM
  • Alternatively, a USB flash drive will be distributed with a VMware image of 64-bit Ubuntu Server 12.04 [Precise Pangolin], with Hadoop, HBase, Sqoop, Hive and Pig installed and configured using Apache Bigtop.

Speaker bio

Prashanth Babu has 9+ years of experience in software development, predominantly in Java and Java EE. He works at NTT DATA Global Delivery Services (previously Keane India Pvt. Ltd.) on an R&D initiative on Big Data using the Apache Hadoop ecosystem. He is also an avid Android enthusiast with experience in Android app development.

http://gplus.to/Prashanth
