Apache Pig User Defined Functions (UDFs) Java Example
1) A machine with Ubuntu 14.04 LTS operating system
2) Apache Hadoop 2.6.4 pre-installed (How to install Hadoop on Ubuntu 14.04)
3) Apache Pig pre-installed (How to install Pig on Ubuntu 14.04)
Pig User Defined Functions (UDFs) Java Example
Using Java, you can write UDFs covering every part of the processing, including data load/store, column transformation, and aggregation. Since Apache Pig itself is written in Java, UDFs written in Java generally run more efficiently than UDFs written in other languages.
When writing UDFs in Java, we can create and use the following three types of functions:
1) Filter Functions
The filter functions are used as conditions in filter statements. These functions accept a Pig value as input and return a Boolean value.
2) Eval Functions
The Eval functions are used in FOREACH-GENERATE statements. These functions accept a Pig value as input and return a Pig result.
3) Algebraic Functions
The Algebraic functions act on inner bags in a FOREACH-GENERATE statement. These functions are used to perform full MapReduce operations on an inner bag.
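The three kinds of function described above can be sketched in plain Java. This is only an illustration of the input/output contract of each kind, not Pig code: it assumes `List<Object>` as a stand-in for a Pig tuple and `List<?>` for a bag so that it compiles without the Pig jars, and every class name, method name, and sample value here is hypothetical. In an actual UDF, the equivalent logic would typically live in a class extending `org.apache.pig.FilterFunc` or `org.apache.pig.EvalFunc`, or implementing `org.apache.pig.Algebraic`.

```java
import java.util.Arrays;
import java.util.List;

// Dependency-free sketch of the three UDF kinds.
// Assumption: List<Object> stands in for a Pig tuple and List<?> for a bag.
class UdfKindsSketch {

    // Filter-style: one input row in, a Boolean out.
    // Hypothetical rule: keep rows whose first field is a non-empty string.
    static Boolean hasName(List<Object> row) {
        if (row == null || row.isEmpty()) {
            return false;
        }
        Object first = row.get(0);
        return first instanceof String && !((String) first).isEmpty();
    }

    // Eval-style: one input row in, a computed value out.
    // Hypothetical rule: length of the first field, or null if it is missing.
    static Integer nameLength(List<Object> row) {
        if (row == null || row.isEmpty() || row.get(0) == null) {
            return null;
        }
        return row.get(0).toString().length();
    }

    // Algebraic-style count, split into three phases.
    // Initial phase: count the rows in one chunk of the bag.
    static long initial(List<?> chunk) {
        return chunk.size();
    }

    // Intermediate phase: combine partial counts into a new partial count.
    static long intermed(List<Long> partials) {
        long sum = 0;
        for (long p : partials) {
            sum += p;
        }
        return sum;
    }

    // Final phase: combine the remaining partial counts into the result.
    static long finalStep(List<Long> partials) {
        return intermed(partials);
    }

    public static void main(String[] args) {
        // Made-up sample row, used only for this demonstration.
        List<Object> row = Arrays.asList("Robin", 22);
        System.out.println(hasName(row));
        System.out.println(nameLength(row));

        // Count a bag of five rows as two chunks, then merge the partial counts.
        long a = initial(Arrays.asList("r1", "r2", "r3"));
        long b = initial(Arrays.asList("r4", "r5"));
        System.out.println(finalStep(Arrays.asList(intermed(Arrays.asList(a, b)))));
    }
}
```

The algebraic part shows why such functions suit MapReduce: because a count can be computed on chunks and the partial results merged afterwards, the work can be split across map, combine, and reduce stages.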
Add these jars to your Java project
Step 1 - Change the directory to /usr/local/pig/bin
Step 2 - Enter the Grunt shell in MapReduce mode.
Step 3 - Create a JAR file of your Java project. Building the JAR file is left to you.
Step 4 - Copy the jar file into HDFS.
Step 5 - Register. The Register operator is used to register a JAR file that contains the UDF. By registering the JAR file, users tell Pig where the UDF is located.
Step 6 - Create an employee_new.txt file.
Step 7 - Add the following lines to the employee_new.txt file. Save and close the file, then store it in HDFS.
Step 8 - Load employee data.
Let us now convert the names of the employees to upper case using the UDF sample_eval.
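The source for sample_eval is not shown here, so the following is only a minimal sketch of the logic such an upper-casing UDF might contain. It assumes `List<Object>` as a stand-in for Pig's `Tuple` so that it compiles without the Pig jars; the class name `SampleEvalSketch` and the sample names are hypothetical. In an actual UDF, the same body would typically sit in the `exec(Tuple)` method of a class extending `org.apache.pig.EvalFunc<String>`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Sketch of the logic an upper-casing eval UDF might contain.
// Assumption: List<Object> stands in for Pig's Tuple.
class SampleEvalSketch {

    // Mirrors the shape of an eval function: one input row in, one value out.
    static String exec(List<Object> input) {
        // Return null for missing input instead of throwing.
        if (input == null || input.isEmpty() || input.get(0) == null) {
            return null;
        }
        // Upper-case the first field; Locale.ROOT keeps the result locale-independent.
        return input.get(0).toString().toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Hypothetical names; the employee data itself is not reproduced here.
        for (String name : Arrays.asList("Robin", "maya")) {
            System.out.println(exec(Arrays.asList((Object) name)));
        }
    }
}
```

Returning null for an empty or missing field lets the caller decide how to treat incomplete rows, rather than stopping on the first bad record.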
The Define operator is used to assign an alias to a UDF or streaming command.