My UDF works on some fields but not on others. If I pass it the first field, ip, the UDF works as intended. However, if I pass it date instead, I get ERROR 1066. Here are my scripts.
Pig script that works and calls the UDF:
REGISTER myudfs.jar;
DEFINE HOUR myudfs.HOUR;
A = load 'access_log_Jul95' using PigStorage(' ') as
    (ip:chararray, dash1:chararray, dash2:chararray, date:chararray, date1:chararray,
     getRequset:chararray, location:chararray, http:chararray, code:int, port:int);
B = FOREACH A GENERATE HOUR(ip);
dump B;
Pig script that does not work and calls the UDF:
REGISTER myudfs.jar;
DEFINE HOUR myudfs.HOUR;
A = load 'access_log_Jul95' using PigStorage(' ') as
    (ip:chararray, dash1:chararray, dash2:chararray, date:chararray, date1:chararray,
     getRequset:chararray, location:chararray, http:chararray, code:int, port:int);
B = FOREACH A GENERATE HOUR(date);
dump B;
Pig script that works but does not call the UDF:
REGISTER myudfs.jar;
DEFINE HOUR myudfs.HOUR;
A = load 'access_log_Jul95' using PigStorage(' ') as
    (ip:chararray, dash1:chararray, dash2:chararray, date:chararray, date1:chararray,
     getRequset:chararray, location:chararray, http:chararray, code:int, port:int);
B = FOREACH A GENERATE date;
dump B;
Sample data
199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "GET /history/apollo/ HTTP/1.0" 200 6245
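To make clear what each field should contain for this sample line, here is a small plain-Java sketch that splits the line on single spaces and pairs the pieces with the names from the AS clause. It only approximates what PigStorage(' ') does, and the class and method names (FieldSplitSketch, splitFields) are hypothetical and not part of the script or of Pig.

```java
// Hypothetical helper, not part of Pig: approximates PigStorage(' ')
// by splitting a line on every single space character.
class FieldSplitSketch {
    // Field names copied from the AS clause of the LOAD statement.
    static final String[] NAMES = {
        "ip", "dash1", "dash2", "date", "date1",
        "getRequset", "location", "http", "code", "port"
    };

    static String[] splitFields(String line) {
        // The -1 limit keeps trailing empty fields instead of dropping them.
        return line.split(" ", -1);
    }

    public static void main(String[] args) {
        String line = "199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "
                + "\"GET /history/apollo/ HTTP/1.0\" 200 6245";
        String[] fields = splitFields(line);
        for (int i = 0; i < fields.length; i++) {
            // Label any piece beyond the declared schema as "(extra)".
            String name = i < NAMES.length ? NAMES[i] : "(extra)";
            System.out.println(name + " = " + fields[i]);
        }
    }
}
```

Under this splitting, the sample line yields ten pieces, and date would hold the text [01/Jul/1995:00:00:01 with the leading bracket. A line with fewer pieces would leave some of the later fields without a value.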
Java UDF
package myudfs;

import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.util.WrappedIOException;

public class HOUR extends EvalFunc<String> {
    @SuppressWarnings("deprecation")
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0)
            return " ";
        try {
            String str = (String) input.get(0);
            return str.substring(0, 1);
        } catch (Exception e) {
            throw WrappedIOException.wrap("Caught exception processing input row ", e);
        }
    }
}
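The core of exec can be exercised outside Pig, which may help narrow things down. The sketch below copies only the cast-and-substring step into a plain Java class and compares it with a guarded variant. It rests on an assumption that nothing above confirms: that the date field is null or empty for some rows, for example because a line has fewer space-separated pieces than the schema declares. The class and method names (HourLogicSketch, firstChar, firstCharGuarded) are hypothetical.

```java
// Hypothetical stand-alone sketch of the UDF's core step, without Pig classes.
class HourLogicSketch {
    // Same logic as the body of HOUR.exec: no check on the field itself.
    static String firstChar(Object field) {
        String str = (String) field;
        return str.substring(0, 1); // throws if str is null or empty
    }

    // Guarded variant (an assumption about a possible fix, not a confirmed one):
    // returns null instead of throwing when the field is null or empty.
    static String firstCharGuarded(Object field) {
        if (field == null) {
            return null;
        }
        String str = field.toString();
        return str.isEmpty() ? null : str.substring(0, 1);
    }

    public static void main(String[] args) {
        Object[] samples = {"[01/Jul/1995:00:00:01", "", null};
        for (Object sample : samples) {
            String unguarded;
            try {
                unguarded = firstChar(sample);
            } catch (RuntimeException e) {
                // In the UDF this exception would be wrapped and rethrown.
                unguarded = "threw " + e.getClass().getSimpleName();
            }
            System.out.println("input=" + sample
                    + " unguarded=" + unguarded
                    + " guarded=" + firstCharGuarded(sample));
        }
    }
}
```

If the unguarded path throws for real rows, the UDF's catch block would rethrow it as an IOException for that row. Whether that is what happens with this data is not established here; the task logs would show the underlying exception.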
Error
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias B
If any other information would help, let me know. I get this error both when running locally and in MapReduce mode.