Apache Pig eval function examples
1) A machine with Ubuntu 14.04 LTS operating system
2) Apache Hadoop 2.6.4 pre-installed (How to install Hadoop on Ubuntu 14.04)
3) Apache Pig pre-installed (How to install Pig on Ubuntu 14.04)
Pig Eval Functions Examples
Apache Pig provides several kinds of built-in functions: eval, load/store, math, string, bag and tuple functions. The eval functions provided by Apache Pig are listed below.
1) AVG - To compute the average of the numerical values within a bag.
2) MAX - To calculate the highest value for a column (numeric values or chararrays) in a single-column bag.
3) MIN - To get the minimum (lowest) value (numeric or chararray) for a certain column in a single-column bag.
4) COUNT - To count the number of tuples in a bag. A tuple whose first field is null is not counted; use COUNT_STAR to include such tuples.
5) DIFF - The DIFF() function of Pig Latin compares two bags (fields) in a tuple. It takes two fields of a tuple as input and matches them. If they match, it returns an empty bag. If they do not match, it finds the elements that exist in one field (bag) but not in the other, and returns these elements wrapped in a bag.
6) SUBTRACT - The SUBTRACT() function of Pig Latin subtracts one bag from another. It takes two bags as input and returns a bag containing the tuples of the first bag that are not in the second bag.
7) IsEmpty - The IsEmpty() function of Pig Latin checks whether a bag or map is empty.
8) PluckTuple - After an operation such as a join, the columns of the two input schemas are distinguished by a relation prefix. PluckTuple() selects the columns that came from one side: first define the function with a string prefix, then apply it to keep only the columns of the relation whose names begin with that prefix.
Step 1 - Change the directory to /usr/local/pig/bin
Step 2 - Start the Grunt shell in MapReduce mode (pig -x mapreduce).
Step 3 - Create a student_gpa.txt file.
Step 4 - Add the following lines to the student_gpa.txt file. Save and close.
Step 5 - Copy student_gpa.txt from the local file system to HDFS. In my case, the student_gpa.txt file is stored in the /home/hduser/Desktop/PIG/ directory.
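From inside the Grunt shell, the copy can be done with the fs command, which passes its arguments to the Hadoop file system shell. The lines below are only a sketch: the HDFS target directory /pig_data is a hypothetical name chosen for illustration, since the actual HDFS location is not stated here.

```pig
-- Run inside the Grunt shell.
-- /pig_data is a hypothetical HDFS directory (an assumption, not from the post).
fs -mkdir -p /pig_data
fs -put /home/hduser/Desktop/PIG/student_gpa.txt /pig_data/
-- List the directory to confirm the file arrived.
fs -ls /pig_data
```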
Step 6 - Load student data.
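A possible way to load the student data and try the aggregate eval functions on it is sketched below. The HDFS path /pig_data, the comma delimiter and the schema (name, term, gpa) are assumptions made for illustration; adjust them to match the actual contents of student_gpa.txt.

```pig
-- Assumed: comma-separated rows of the form name,term,gpa
-- stored at the hypothetical HDFS path /pig_data/student_gpa.txt.
student = LOAD '/pig_data/student_gpa.txt'
          USING PigStorage(',')
          AS (name:chararray, term:chararray, gpa:float);

-- Eval functions such as AVG, MAX, MIN and COUNT work on bags,
-- so group the rows first; each group carries a bag named "student".
by_name = GROUP student BY name;

stats = FOREACH by_name GENERATE
            group            AS name,
            AVG(student.gpa) AS avg_gpa,   -- average of the gpa values in the bag
            MAX(student.gpa) AS max_gpa,   -- highest gpa in the bag
            MIN(student.gpa) AS min_gpa,   -- lowest gpa in the bag
            COUNT(student)   AS num_rows;  -- tuples in the bag (null first field skipped)

DUMP stats;
```

Projecting student.gpa yields a single-column bag, which is the input shape MAX and MIN expect.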
Step 7 - Create an emp_sales.txt file.
Step 8 - Add the following lines to the emp_sales.txt file. Save and close.
Step 9 - Create an emp_bonus.txt file.
Step 10 - Add the following lines to the emp_bonus.txt file. Save and close.
Step 11 - Copy emp_bonus.txt and emp_sales.txt from the local file system to HDFS. In my case, the emp_bonus.txt and emp_sales.txt files are stored in the /home/hduser/Desktop/PIG/ directory.
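As a sketch, both files can be copied in one fs -put call from the Grunt shell. The HDFS directory /pig_data is again a hypothetical target used only for illustration.

```pig
-- Run inside the Grunt shell.
-- /pig_data is a hypothetical HDFS directory (an assumption, not from the post).
fs -mkdir -p /pig_data
fs -put /home/hduser/Desktop/PIG/emp_sales.txt /home/hduser/Desktop/PIG/emp_bonus.txt /pig_data/
fs -ls /pig_data
```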
Step 12 - Load employee sales data.
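The load statement might look like the following. The path, the delimiter and the five-column schema (sno, name, age, salary, dept) are assumptions for illustration and should be changed to match the real emp_sales.txt.

```pig
-- Assumed: comma-separated rows of the form sno,name,age,salary,dept
-- at the hypothetical HDFS path /pig_data/emp_sales.txt.
emp_sales = LOAD '/pig_data/emp_sales.txt'
            USING PigStorage(',')
            AS (sno:int, name:chararray, age:int, salary:int, dept:chararray);

-- Check the schema and the loaded rows.
DESCRIBE emp_sales;
DUMP emp_sales;
```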
Step 13 - Load employee bonus data.
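With both relations available, DIFF, SUBTRACT, IsEmpty and PluckTuple can be tried out as sketched below. The block reloads both files so that it stands alone. The path /pig_data and the schema (sno, name, age, salary, dept) are assumptions, and SUBTRACT and PluckTuple are only available in more recent Pig releases, so check the installed version.

```pig
-- Assumed schema and hypothetical HDFS paths; adjust to the real files.
emp_sales = LOAD '/pig_data/emp_sales.txt' USING PigStorage(',')
            AS (sno:int, name:chararray, age:int, salary:int, dept:chararray);
emp_bonus = LOAD '/pig_data/emp_bonus.txt' USING PigStorage(',')
            AS (sno:int, name:chararray, age:int, salary:int, dept:chararray);

-- COGROUP puts the matching tuples of each relation into two bags per key.
cogrouped = COGROUP emp_sales BY sno, emp_bonus BY sno;

-- DIFF: tuples that appear in one bag but not in the other.
diff_data = FOREACH cogrouped GENERATE group AS sno, DIFF(emp_sales, emp_bonus);
DUMP diff_data;

-- SUBTRACT: tuples of the first bag that are not in the second bag.
sub_data = FOREACH cogrouped GENERATE group AS sno, SUBTRACT(emp_sales, emp_bonus);
DUMP sub_data;

-- IsEmpty: keep only the keys that have no tuple in emp_bonus.
sales_only = FILTER cogrouped BY IsEmpty(emp_bonus);
DUMP sales_only;

-- PluckTuple: after a join, keep only the columns prefixed with emp_sales::
joined = JOIN emp_sales BY sno, emp_bonus BY sno;
DEFINE pluck PluckTuple('emp_sales::');
sales_cols = FOREACH joined GENERATE FLATTEN(pluck(*));
DESCRIBE sales_cols;
```

COGROUP is used here because DIFF, SUBTRACT and IsEmpty all operate on bags within a single tuple, not on whole relations.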