REPLACE in Pig gives an error

Suppose my file is named 'data' and looks like this:

2343234 {23.8375, -2.339921102} {(343.34333, -2.0000022)} 5-23-2013-11-am

I need to convert the second field to a pair of coordinates. So I wrote the following code and called it basic.pig:

    A = LOAD 'data' AS (f1:int, f2:chararray, f3:chararray, f4:chararray);
    B = foreach A generate STRSPLIT(f2,',').$0 as f5, STRSPLIT(f2,',').$1 as f6;
    C = foreach B generate REPLACE(f5,'{',' ') as f7, REPLACE(f6,'}',' ') as f8;

and then used (float) to convert the string to float. But the REPLACE command does not work, and I get the following error:

    -bash-3.2$ pig -x local basic.pig
    2013-06-24 16:38:45,030 [main] INFO  org.apache.pig.Main - Apache Pig version 0.11.1 (r1459641) compiled Mar 22 2013, 02:13:53
    2013-06-24 16:38:45,031 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/--/p/--test/pig_1372117125028.log
    2013-06-24 16:38:45,321 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /home/isl/pmahboubi/.pigbootup not found
    2013-06-24 16:38:45,425 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
    2013-06-24 16:38:46,069 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Lexical error at line 7, column 0. Encountered: <EOF> after : ""
    Details at logfile: /home/--/p/--test/pig_1372117125028.log

And here are the details of pig_137..log

    Pig Stack Trace
    ---------------
    ERROR 1000: Error during parsing. Lexical error at line 7, column 0. Encountered: <EOF> after : ""

    org.apache.pig.tools.pigscript.parser.TokenMgrError: Lexical error at line 7, column 0. Encountered: <EOF> after : ""
        at org.apache.pig.tools.pigscript.parser.PigScriptParserTokenManager.getNextToken(PigScriptParserTokenManager.java:3266)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.jj_ntk(PigScriptParser.java:1134)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:104)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
        at org.apache.pig.Main.run(Main.java:604)
        at org.apache.pig.Main.main(Main.java:157)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
    ================================================================================
2 answers

I have the following data:

    2724 1919 2012-11-18T23:57:56.000Z {(33.80981975),(-118.105289)}
    2703 6401 2012-11-18T23:57:56.000Z {(55.83525609),(-4.07733138)}
    1200 4015 2012-11-18T23:57:56.000Z {(41.49609152),(13.8411998)}
    7104 9227 2012-11-18T23:57:56.000Z {(-24.95351118),(-53.46538723)}

and I can do this:

    A = LOAD 'my_tsv_data' USING PigStorage('\t') AS (id1:int, id2:int, date:chararray, loc:chararray);
    B = FOREACH A GENERATE REPLACE(loc,'\\{|\\}|\\(|\\)','');
    C = LIMIT B 10;
    DUMP C;
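Pig documents the second argument of REPLACE as a regular expression, and as far as I know the built-in hands it to Java's `String.replaceAll`, so the pattern can be checked outside Pig in plain Java. The sketch below is only an illustration under that assumption: the class and method names (`ReplaceRegexDemo`, `stripBrackets`, `compiles`) are made up for this example and are not part of Pig. It also shows that a bare `{` is not a valid Java regex, which is why the pattern escapes the braces; that is a separate matter from the lexical error reported in the question.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class ReplaceRegexDemo {
    // Same pattern as in the Pig statement: the regex \{|\}|\(|\) written
    // with doubled backslashes inside the string literal.
    public static final String BRACKETS = "\\{|\\}|\\(|\\)";

    // Removes every curly brace and parenthesis from the field.
    public static String stripBrackets(String loc) {
        return loc.replaceAll(BRACKETS, "");
    }

    // Returns true if the given text is accepted as a Java regular expression.
    public static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String loc = "{(33.80981975),(-118.105289)}";
        String cleaned = stripBrackets(loc);
        System.out.println(cleaned); // prints 33.80981975,-118.105289

        // Once the brackets are gone, the two halves parse as floats.
        String[] parts = cleaned.split(",");
        float lat = Float.parseFloat(parts[0]);
        float lon = Float.parseFloat(parts[1]);
        System.out.println(lat + " " + lon);

        System.out.println(compiles("{"));   // prints false
        System.out.println(compiles("\\{")); // prints true
    }
}
```

In Pig the equivalent next step would be to split the cleaned string on the comma and cast each part with (float).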

This error

 ERROR 1000: Error during parsing. Lexical error at line 7, column 0. Encountered: <EOF> after : "" 

came up for me because I mixed different kinds of quote characters: I opened a string with one kind and closed it with another, and it took quite a while to find out what went wrong. So it had nothing to do with line 7 (my script was not that long, and cutting the data down to four rows naturally didn't help), nothing to do with column 0, nothing to do with an EOF in the data, and hardly anything to do with the " marks, which I did not use. A pretty misleading error message.

I tracked down the cause using Pig's command-line shell.
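Since the parser's message does not point at the offending line, a quick scan of the script for unbalanced or look-alike quote characters can narrow things down. The following is a rough heuristic sketch, not a Pig tool: `QuoteCheck` and `suspiciousLines` are hypothetical names, and the set of look-alike characters (typographic quotes, acute accent, backtick) is my assumption about what commonly sneaks in. It ignores escaped quotes and apostrophes in comments, so it can report false positives.

```java
import java.util.ArrayList;
import java.util.List;

public class QuoteCheck {
    // Characters that resemble a quote but are not the plain ASCII apostrophe:
    // typographic single and double quotes, acute accent, backtick.
    private static final String LOOKALIKES = "\u2018\u2019\u201C\u201D\u00B4`";

    // Returns the 1-based numbers of lines that contain a look-alike quote
    // or an odd number of ASCII apostrophes.
    public static List<Integer> suspiciousLines(String script) {
        List<Integer> hits = new ArrayList<>();
        String[] lines = script.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            int apostrophes = 0;
            boolean lookalike = false;
            for (char c : lines[i].toCharArray()) {
                if (c == '\'') {
                    apostrophes++;
                }
                if (LOOKALIKES.indexOf(c) >= 0) {
                    lookalike = true;
                }
            }
            if (lookalike || apostrophes % 2 != 0) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Line 2 closes its first string with a backtick instead of an apostrophe.
        String script =
            "A = LOAD 'data' AS (f1:int, f2:chararray);\n" +
            "B = FOREACH A GENERATE REPLACE(f2, 'x`, '');\n";
        System.out.println(suspiciousLines(script)); // prints [2]
    }
}
```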


Source: https://habr.com/ru/post/1487977/

