Pig is a data flow language. It uses the Pig Latin language for Big Data processing. Pig Latin consists of high-level commands and operators that are easy to learn and understand. It is especially useful for non-Java developers who need to process Big Data.
During execution, the Pig engine converts a script through the following plans, in order:
- Logical Plan
- Physical Plan
- Map Reduce Plan
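Once Pig is installed, these plans can be inspected with Pig Latin's EXPLAIN operator. The following is a minimal sketch, not part of the original steps: it writes a small word-count script that ends with EXPLAIN and runs it in local mode if the pig command is available. The file names, the sample input, and the helper name write_explain_script are hypothetical.

```shell
#!/bin/sh
# Sketch: write a small Pig Latin word-count script that ends with EXPLAIN,
# so Pig prints its plans for the "counts" relation.
# All file names below are made up for illustration.

write_explain_script() {
    # $1 = path of the Pig script to create, $2 = path of the input text file
    cat > "$1" <<EOF
lines   = LOAD '$2' AS (line:chararray);
words   = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
grouped = GROUP words BY word;
counts  = FOREACH grouped GENERATE group, COUNT(words);
EXPLAIN counts;
EOF
}

workdir=$(mktemp -d)
printf 'hello pig\nhello hadoop\n' > "$workdir/input.txt"
write_explain_script "$workdir/explain_demo.pig" "$workdir/input.txt"

# Run in local mode only if pig is already installed and on the PATH.
if command -v pig >/dev/null 2>&1; then
    pig -x local "$workdir/explain_demo.pig"
else
    echo "pig not found; script written to $workdir/explain_demo.pig"
fi
```

The output of EXPLAIN should contain one section per plan, which makes it a convenient way to see how a few lines of Pig Latin map onto MapReduce jobs.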
Let us see how to install and configure Pig on an Apache Hadoop 2.0 cluster.
Step:1 [Download the stable version]
Download the stable version from the link below.
Release notes link
Step:2 [Copy the package to /usr/lib]
Copy the downloaded package to the /usr/lib directory.
Step:3 [Unzip and change the owner]
>> cd /usr/lib
>> sudo tar xzf pig-0.15.0.tar.gz
>> sudo mv pig-0.15.0 pig
>> sudo chown -R huser:hadoop pig
The chown command changes the owner of the pig directory from root to the Hadoop user "huser".
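The unpack and rename steps can also be wrapped in a small script, which helps when the same install is repeated on several nodes. This is only a sketch under assumptions: the function names unpack_and_rename and owner_of are hypothetical, and owner_of relies on the GNU coreutils form of stat, which may differ on other systems.

```shell
#!/bin/sh
# Sketch: unpack a release tarball into a target directory, rename the
# extracted folder, and report who owns the result.

unpack_and_rename() {
    # $1 = tarball, $2 = parent directory, $3 = extracted name, $4 = final name
    tar xzf "$1" -C "$2" || return 1
    mv "$2/$3" "$2/$4"
}

owner_of() {
    # Prints "user:group" for the given path (GNU coreutils stat syntax).
    stat -c '%U:%G' "$1"
}

# Example calls (run with sudo when the target is /usr/lib):
#   unpack_and_rename /usr/lib/pig-0.15.0.tar.gz /usr/lib pig-0.15.0 pig
#   owner_of /usr/lib/pig
```

After the chown command above, owner_of /usr/lib/pig would be expected to print huser:hadoop.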
Step:4 [Log in as the Hadoop user "huser" and set the environment variables]
>> su - huser
Add the following two lines to the ~/.bashrc file.
export PIG_HOME="/usr/lib/pig"
export PATH=$PATH:$PIG_HOME/bin
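If this step is scripted rather than done in an editor, it helps to avoid adding the same lines twice when the script is rerun. A minimal sketch, assuming a hypothetical helper named append_once:

```shell
#!/bin/sh
# Sketch: add a line to a file only if that exact line is not already there.

append_once() {
    # $1 = file, $2 = line to add
    touch "$1"
    # -x matches whole lines, -F treats the pattern as a fixed string
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Example calls. Single quotes keep $PATH and $PIG_HOME unexpanded so the
# literal text lands in the file:
#   append_once ~/.bashrc 'export PIG_HOME="/usr/lib/pig"'
#   append_once ~/.bashrc 'export PATH=$PATH:$PIG_HOME/bin'
```

Running the example calls a second time leaves ~/.bashrc unchanged, because each line is already present.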
Step:5 [Source the profile file to reflect the changes]
>> . ~/.bashrc
Step:6 [Verify the pig command]
>> pig -help
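If the pig command is not found, the usual causes are an unset PIG_HOME, a missing launcher under its bin directory, or a PATH that does not include that directory. The sketch below checks these three things in order; the function name check_pig_install and its return codes are hypothetical, not part of Pig.

```shell
#!/bin/sh
# Sketch: diagnose a Pig setup by checking the home directory, the launcher
# script, and the PATH entry. Returns 0 when all three checks pass.

check_pig_install() {
    # $1 = expected Pig home directory (normally "$PIG_HOME")
    [ -n "$1" ] || { echo "PIG_HOME is not set"; return 1; }
    [ -x "$1/bin/pig" ] || { echo "no executable at $1/bin/pig"; return 2; }
    case ":$PATH:" in
        *":$1/bin:"*) echo "ok: $1/bin is on PATH" ;;
        *) echo "$1/bin is not on PATH"; return 3 ;;
    esac
}

# Example call:
#   check_pig_install "$PIG_HOME"
```

A return code of 1 or 3 usually means the ~/.bashrc changes from Step:4 were not saved or not sourced in the current shell.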