Hands-on introduction to Pig

Fri 27 Jul 2012, 09:30 AM – 05:30 PM IST

Sat 28 Jul 2012, 09:30 AM – 05:00 PM IST

Nimhans Convention Centre, Bangalore

What are your users doing on your website or in your store? How do you turn the piles of data your organization generates into actionable information? Where do you get complementary data to make yours more comprehensive? What tech, and what techniques?

The Fifth Elephant is a two day conference on big data.

Early Geek tickets are available from fifthelephant.doattend.com.

The proposal funnel below lets you submit a session and vote on proposed sessions. It is good practice to introduce yourself and share details about your work, as well as the subject of your talk, when proposing a session.

Each community member can vote for or against a talk. A vote from each member of the Editorial Panel is equivalent to two community votes. Both types of votes will be considered for final speaker selection.

It’s useful to keep a few guidelines in mind while submitting proposals:

Describe how to use something that is available under a liberal open source license. Participants can use this without having to pay you anything.
Tell a story of how you did something. If it involves commercial tools, please explain why they made sense.
Buy a slot to pitch whatever commercial tool you are backing.

Speakers will get a free ticket to both days of the event. Proposers whose talks are not on the final schedule will be able to purchase tickets at the Early Geek price of Rs. 1800.

Hosted by

The Fifth Elephant

The Fifth Elephant, known as one of the best data science and machine learning conferences in Asia, has transitioned into a year-round forum for conversations about data and ML engineering, data science in production, and data security and privacy practices.

Hands-on introduction to Pig

Submitted Jun 10, 2012

Section: Big Data Infrastructure & Processing Technical level: Beginner Session type: Workshop

Pig is a high-level platform for creating MapReduce programs that run on Hadoop to analyze big data.
This is a 2-hour introductory workshop on Pig.
It is a live-coding session that introduces big data analysis with Pig.

Outline

This workshop will include discussion on:

Basics of Hadoop
Basics of Pig and PigLatin
Pig vs MapReduce
Pig vs SQL

And also:

Live-coding session on analyzing a large sample data set with Pig.
Visualizing the MapReduce jobs that Pig generates, using Twitter Ambrose.
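
To give a feel for the "Pig vs MapReduce" comparison in the outline, here is a minimal sketch, not taken from the workshop material, of the map, shuffle and reduce phases of a word count written out by hand in plain Python. The function names and sample input are hypothetical. The comment at the top shows roughly what the same job looks like as a Pig Latin script, assuming an input file named input.txt.

```python
# Roughly equivalent Pig Latin (illustrative; 'input.txt' is an assumed path):
#   lines   = LOAD 'input.txt' AS (line:chararray);
#   words   = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
#   grouped = GROUP words BY word;
#   counts  = FOREACH grouped GENERATE group, COUNT(words);
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield word, 1


def shuffle_phase(pairs):
    """Group the emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the grouped values to get one count per word."""
    return {word: sum(values) for word, values in groups.items()}


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["pig latin runs on hadoop", "pig compiles to mapreduce jobs"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

The point of the comparison is that the Pig Latin version states only the data flow (load, tokenize, group, count), while the hand-written version has to spell out each phase.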

Requirements

Basic understanding of Hadoop, HDFS and MapReduce.
Laptop with VMware Player or Oracle VirtualBox installed.
Please download Cloudera Demo VM from https://ccp.cloudera.com/display/SUPPORT/Cloudera’s+Hadoop+Demo+VM
Alternatively, a USB flash drive will be distributed with a VMware image of 64-bit Ubuntu Server 12.04 (Precise Pangolin) with Hadoop, HBase, Sqoop, Hive and Pig installed and configured using Apache Bigtop.

Speaker bio

Prashanth Babu has 9+ years of experience in software development, predominantly in Java and Java EE. He works with NTT DATA Global Delivery Services (previously Keane India Pvt. Ltd.) on an R&D initiative on big data using the Apache Hadoop ecosystem. He is also an avid Android enthusiast with experience in Android app development.

http://gplus.to/Prashanth