Apache Pig 0.15.0 Installation on Ubuntu 14.04

posted on Nov 20th, 2016

Apache Pig

Apache Pig is a high-level platform for creating programs that run on Apache Hadoop. The language for this platform is called Pig Latin. Pig can execute its Hadoop jobs in MapReduce, Apache Tez, or Apache Spark. Pig Latin abstracts the programming from the Java MapReduce idiom into a notation which makes MapReduce programming high level, similar to that of SQL for RDBMSs. Pig Latin can be extended using User Defined Functions (UDFs) which the user can write in Java, Python, JavaScript, Ruby or Groovy and then call directly from the language.

Prerequisites

1) A machine with Ubuntu 14.04 LTS operating system

2) Apache Hadoop 2.6.4 preinstalled (How to install Hadoop on Ubuntu 14.04)

3) Apache Pig 0.15.0 Software (Download Here)

Pig Installation

Installation Steps

Step 1 - Create the pig directory. Open a new terminal (CTRL + ALT + T) and enter the following command.

$ sudo mkdir /usr/local/pig

Step 2 - Change the ownership and permissions of the directory /usr/local/pig. Here 'hduser' is an Ubuntu username.

$ sudo chown -R hduser /usr/local/pig
$ sudo chmod -R 755 /usr/local/pig
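To confirm that the ownership and permission changes took effect, a small check like the sketch below can help. It assumes GNU stat (the default on Ubuntu); the function name owner_and_mode is made up for illustration.

```shell
#!/usr/bin/env bash
# Print "<owner> <octal mode>" for a path.
# Assumes GNU stat, which ships with Ubuntu.
owner_and_mode() {
    stat -c '%U %a' "$1"
}
```

After step 2, running owner_and_mode /usr/local/pig should report hduser as the owner and 755 as the mode.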

Step 3 - Switch to the hduser account. The su (switch user) command lets you execute commands with the privileges of another user account.

$ su hduser

Step 4 - Change the directory to /home/hduser/Desktop. In my case the downloaded pig-0.15.0.tar.gz file is in the /home/hduser/Desktop folder. For you it might be in the downloads folder, so check where your browser saved it.

$ cd /home/hduser/Desktop/

Step 5 - Untar the pig-0.15.0.tar.gz file.

$ tar xzf /home/hduser/Desktop/pig-0.15.0.tar.gz

Step 6 - Move the contents of pig-0.15.0 folder to /usr/local/pig

$ mv pig-0.15.0/* /usr/local/pig
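Steps 5 and 6 can also be combined into a single extraction. The sketch below assumes GNU tar and an archive with one top-level folder (pig-0.15.0 in this case); unpack_into is a hypothetical helper name, not part of Pig or Hadoop.

```shell
#!/usr/bin/env bash
# Extract a .tar.gz archive straight into a target directory,
# dropping the archive's single top-level folder (GNU tar option).
unpack_into() {
    local archive="$1" dest="$2"
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    tar xzf "$archive" -C "$dest" --strip-components=1
}
```

For this tutorial the call would be: unpack_into /home/hduser/Desktop/pig-0.15.0.tar.gz /usr/local/pig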

Step 7 - Edit the $HOME/.bashrc file to add the Pig paths. The file belongs to the current user, so sudo is not needed.

$ gedit $HOME/.bashrc

Add the following lines to the end of the $HOME/.bashrc file.

export PIG_HOME=/usr/local/pig
export PATH=$PIG_HOME/bin:$PATH
export PIG_CLASSPATH=$HADOOP_HOME/etc/hadoop

Step 8 - Reload your changed $HOME/.bashrc settings

$ source $HOME/.bashrc
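If you prefer to script step 7 instead of editing the file by hand, one possible approach is to append each line only when it is missing, so that running the setup twice does not duplicate entries. This is a sketch; append_once is a name I chose for illustration.

```shell
#!/usr/bin/env bash
# Append a line to a file only if that exact line is not already present.
append_once() {
    local line="$1" file="$2"
    touch "$file" || return 1
    # -x matches whole lines, -F treats the text literally (no regex).
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For example: append_once 'export PIG_HOME=/usr/local/pig' "$HOME/.bashrc", and likewise for the other two export lines. Reload the file with source afterwards, as in step 8.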

Step 9 - Change the directory to /usr/local/pig/conf

$ cd /usr/local/pig/conf

Step 10 - Verify Pig Installation.

$ pig -version 
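If the version command fails with "command not found", the PATH change from step 7 probably did not take effect. A quick way to check where a command resolves is sketched below; require_command is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Report where a command resolves on PATH, or fail with a hint.
require_command() {
    local name="$1" resolved
    if resolved=$(command -v "$name"); then
        echo "$name found at $resolved"
    else
        echo "$name not found on PATH; check PIG_HOME and PATH in ~/.bashrc" >&2
        return 1
    fi
}
```

With the settings from step 7, require_command pig should point at /usr/local/pig/bin/pig.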


Step 11 - Before you fire up Apache Pig, you need to start the history server daemon of Hadoop. Otherwise Pig will throw a runtime exception, which you will see in the terminal.

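Step 12 - Edit mapred-site.xml, which is in the Hadoop configuration directory (/usr/local/hadoop/etc/hadoop in this setup).

Step 13 - Add the following lines to mapred-site.xml, inside the <configuration> element. Don't forget to replace host:port with the host and port number of the history server (the default port is 10020).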

<property>
 <name>mapreduce.jobhistory.address</name>
 <value>host:port</value>
</property>
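To double-check that the property actually made it into the file, a simple text search is usually enough. The sketch below is a plain grep match, not a real XML parse, and has_property is a name I made up for illustration.

```shell
#!/usr/bin/env bash
# Return success if a Hadoop-style XML config file contains a <name> entry
# for the given property. This is a literal text match, not an XML parse.
has_property() {
    local file="$1" prop="$2"
    grep -qF "<name>${prop}</name>" "$file"
}
```

For example: has_property /usr/local/hadoop/etc/hadoop/mapred-site.xml mapreduce.jobhistory.address && echo "configured"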

Step 14 - Change the directory to /usr/local/hadoop/sbin

$ cd /usr/local/hadoop/sbin

Step 15 - Start the History Server.

$ ./mr-jobhistory-daemon.sh --config /usr/local/hadoop/etc/hadoop start historyserver
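To see whether the daemon really came up, you can look for JobHistoryServer in the output of jps, the JDK tool that lists running Java processes. The sketch below assumes jps is on your PATH; history_server_listed is a hypothetical helper that reads the listing from standard input.

```shell
#!/usr/bin/env bash
# Read a process listing (such as the output of `jps`) on stdin and
# succeed if a JobHistoryServer entry is present.
history_server_listed() {
    grep -qw 'JobHistoryServer'
}
```

Usage: jps | history_server_listed && echo "history server is running"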

Step 16 - Change the directory to /usr/local/pig/bin

$ cd /usr/local/pig/bin

Step 17 - Start the Grunt shell in local mode.

$ ./pig -x local

OR

Step 18 - Start the Grunt shell in MapReduce mode.

$ ./pig -x mapreduce
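Besides the interactive Grunt shell, Pig can also run a script file, which makes for a quick smoke test of the installation in local mode. The sketch below only writes the test files; the file names, the sample data, and the helper name write_smoke_test are all made up for illustration.

```shell
#!/usr/bin/env bash
# Write a tiny input file and a two-line Pig Latin script into a directory.
write_smoke_test() {
    local dir="$1"
    mkdir -p "$dir" || return 1
    printf 'hello\npig\n' > "$dir/input.txt"
    # The script loads each input line as one chararray field and prints it.
    cat > "$dir/smoke.pig" <<'EOF'
lines = LOAD 'input.txt' AS (line:chararray);
DUMP lines;
EOF
}
```

After calling write_smoke_test /tmp/pig-smoke, change into /tmp/pig-smoke and run pig -x local smoke.pig. In local mode the relative path input.txt is resolved against the current directory.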
