Apache Pig group example
Prerequisites
1) A machine with the Ubuntu 14.04 LTS operating system
2) Apache Hadoop 2.6.4 pre-installed (How to install Hadoop on Ubuntu 14.04)
3) Apache Pig pre-installed (How to install Pig on Ubuntu 14.04)
Pig Group Example
The GROUP operator groups the data in one or more relations. It collects all tuples that share the same key value into a single bag, producing one output tuple per distinct key.
Step 1 - Change the directory to /usr/local/pig/bin
Step 2 - Enter the Grunt shell in MapReduce mode.
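Steps 1 and 2 might look like the following in a terminal. This is a sketch that assumes the install path from Step 1. The `-x mapreduce` flag selects MapReduce mode (which is also Pig's default), and the existence check is only there so the snippet fails gracefully on a machine where Pig is installed somewhere else.

```shell
# Install path taken from Step 1; adjust if Pig lives elsewhere.
PIG_BIN=/usr/local/pig/bin

if [ -x "$PIG_BIN/pig" ]; then
  cd "$PIG_BIN"
  # -x mapreduce runs Pig against the Hadoop cluster and HDFS.
  ./pig -x mapreduce
else
  echo "pig executable not found in $PIG_BIN" >&2
fi
```

Once Pig starts, the prompt changes to `grunt>`.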
Step 3 - Create a student_details.txt file.
Step 4 - Add the following lines to the student_details.txt file.
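The original sample rows are not reproduced here, so the sketch below writes a small, purely hypothetical data set instead. The column layout (id, first name, age, city) and every value in it are assumptions made for illustration. The file is written to the current directory, so run the snippet from the directory where you keep your Pig data.

```shell
# Hypothetical sample data: id,firstname,age,city (comma-separated, no header).
cat > student_details.txt <<'EOF'
1,Rajiv,21,Hyderabad
2,Siddarth,22,Kolkata
3,Rajesh,22,Delhi
4,Preethi,21,Pune
5,Trupthi,23,Chennai
6,Archana,23,Chennai
EOF

# Quick check that all six rows were written.
wc -l student_details.txt
```

Ages and cities repeat on purpose so that the grouping steps have something to collect.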
Step 5 - Copy student_details.txt from the local file system to HDFS. In my case, the student_details.txt file is stored in the /home/hduser/Desktop/PIG directory.
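One possible way to do the copy from a regular terminal is sketched below. The HDFS directory /user/hduser/pig is an assumed target, not one given in this post; the local path is the one mentioned in Step 5.

```shell
# Assumed HDFS target directory; adjust to your cluster layout.
HDFS_DIR=/user/hduser/pig
# Local path as described in Step 5.
LOCAL_FILE=/home/hduser/Desktop/PIG/student_details.txt

if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -mkdir -p "$HDFS_DIR"               # create the target directory if missing
  hdfs dfs -put -f "$LOCAL_FILE" "$HDFS_DIR/"  # -f overwrites an existing copy
  hdfs dfs -ls "$HDFS_DIR"                     # confirm the file arrived
else
  echo "hdfs command not found; is Hadoop on your PATH?" >&2
fi
```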
Step 6 - Load Data.
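The load could look like this. The sketch saves the Pig Latin statements to a script (load_students.pig, a hypothetical name) so they can be rerun; the same statements can also be typed at the `grunt>` prompt. The HDFS path and the four-column schema are assumptions, so match them to your own file.

```shell
cat > load_students.pig <<'EOF'
-- Assumed HDFS path and schema; adjust both to your data.
student_details = LOAD '/user/hduser/pig/student_details.txt'
    USING PigStorage(',')
    AS (id:int, firstname:chararray, age:int, city:chararray);

-- Print the loaded tuples to confirm the schema was applied.
DUMP student_details;
EOF
```

Run the script with `pig -x mapreduce load_students.pig`.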
Step 7 - Group the data using age as the key.
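A minimal sketch of grouping by a single key follows, again with an assumed HDFS path, schema, and script name. Each tuple of the result has two fields: `group`, which holds the key (here the age), and a bag named after the input relation that holds every student tuple with that age.

```shell
cat > group_by_age.pig <<'EOF'
-- Assumed HDFS path and schema; adjust both to your data.
student_details = LOAD '/user/hduser/pig/student_details.txt'
    USING PigStorage(',')
    AS (id:int, firstname:chararray, age:int, city:chararray);

-- One output tuple per distinct age.
group_by_age = GROUP student_details BY age;

-- Show the schema of the grouped relation, then its contents.
DESCRIBE group_by_age;
DUMP group_by_age;
EOF
```

Run the script with `pig -x mapreduce group_by_age.pig`, or enter the statements at the `grunt>` prompt.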
Step 8 - Group data by multiple keys.
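Grouping by more than one key only changes the BY clause: the keys go in parentheses. The sketch below assumes the same hypothetical path and schema and groups by age and city together, so the `group` field of each result tuple is itself a tuple of the form (age, city).

```shell
cat > group_by_age_city.pig <<'EOF'
-- Assumed HDFS path and schema; adjust both to your data.
student_details = LOAD '/user/hduser/pig/student_details.txt'
    USING PigStorage(',')
    AS (id:int, firstname:chararray, age:int, city:chararray);

-- One output tuple per distinct (age, city) combination.
group_multiple = GROUP student_details BY (age, city);

DUMP group_multiple;
EOF
```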
Step 9 - Group by all.
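GROUP ... ALL places every tuple of the relation into a single group, which is the usual first step before a whole-relation aggregate such as a row count. The sketch below uses the same assumed path and schema; the COUNT at the end is an added illustration and not part of the original steps.

```shell
cat > group_all.pig <<'EOF'
-- Assumed HDFS path and schema; adjust both to your data.
student_details = LOAD '/user/hduser/pig/student_details.txt'
    USING PigStorage(',')
    AS (id:int, firstname:chararray, age:int, city:chararray);

-- A single output tuple: the group field is the string 'all',
-- followed by one bag containing every student tuple.
group_all = GROUP student_details ALL;
DUMP group_all;

-- Illustration only: count the rows in that single bag.
student_count = FOREACH group_all GENERATE COUNT(student_details);
DUMP student_count;
EOF
```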
Grouping Two Relations using Cogroup
The COGROUP operator works the same way as the GROUP operator. The difference is one of convention: GROUP is normally used in statements involving a single relation, while COGROUP is used in statements involving two or more relations.
Step 10 - Create an employee_details.txt file.
Step 11 - Add the following lines to the employee_details.txt file.
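As with the student file, the original rows are not reproduced here, so the following sketch writes hypothetical employee data. The layout (id, name, age, city) and all values are assumptions chosen so that some employee ages match ages in a student data set and some do not.

```shell
# Hypothetical sample data: id,name,age,city (comma-separated, no header).
cat > employee_details.txt <<'EOF'
1,Robin,22,London
2,Bob,23,Chicago
3,Maya,23,Tokyo
4,Sara,25,London
5,David,23,Mumbai
6,Maggy,22,Chennai
EOF

# Quick check that all six rows were written.
wc -l employee_details.txt
```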
Step 12 - Create a student_details.txt file (skip Steps 12 and 13 if the file from Steps 3 and 4 is still in place).
Step 13 - Add the following lines to the student_details.txt file.
Step 14 - Copy student_details.txt and employee_details.txt from the local file system to HDFS. In my case, the student_details.txt and employee_details.txt files are stored in the /home/hduser/Desktop/PIG directory.
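Both files can be copied with a single `put`, since the command accepts several local sources followed by one destination. The HDFS directory below is an assumed target, not one given in this post.

```shell
# Assumed HDFS target directory; adjust to your cluster layout.
HDFS_DIR=/user/hduser/pig
# Local directory as described in Step 14.
LOCAL_DIR=/home/hduser/Desktop/PIG

if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -mkdir -p "$HDFS_DIR"
  # Several sources, one destination directory; -f overwrites existing copies.
  hdfs dfs -put -f "$LOCAL_DIR/student_details.txt" \
                   "$LOCAL_DIR/employee_details.txt" "$HDFS_DIR/"
  hdfs dfs -ls "$HDFS_DIR"
else
  echo "hdfs command not found; is Hadoop on your PATH?" >&2
fi
```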
Step 15 - Load student_details and employee_details data.
Step 16 - Cogroup the two relations using the student age and the employee age as keys.
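Steps 15 and 16 can be sketched together as one script. The HDFS paths, both schemas, and the script name are assumptions for illustration. Each result tuple has three fields: `group` (the age), a bag of matching student tuples, and a bag of matching employee tuples. An age that appears in only one of the two relations still produces a tuple, with an empty bag for the other relation.

```shell
cat > cogroup_by_age.pig <<'EOF'
-- Assumed HDFS paths and schemas; adjust to your data.
student_details = LOAD '/user/hduser/pig/student_details.txt'
    USING PigStorage(',')
    AS (id:int, firstname:chararray, age:int, city:chararray);

employee_details = LOAD '/user/hduser/pig/employee_details.txt'
    USING PigStorage(',')
    AS (id:int, name:chararray, age:int, city:chararray);

-- Group both relations on age in a single pass.
cogroup_data = COGROUP student_details BY age, employee_details BY age;

DUMP cogroup_data;
EOF
```

Run the script with `pig -x mapreduce cogroup_by_age.pig`, or enter the statements at the `grunt>` prompt.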