2013-06-24 36 views
4

Pig REPLACE gives an error

Let's assume my file is named 'data' and looks like this:

2343234 {23.8375,-2.339921102} {(343.34333,-2.0000022)} 5-23-2013-11-AM

I need to convert the second field into a pair of coordinate numbers, so I wrote the following code and saved it as basic.pig:

A = LOAD 'data' AS (f1:int, f2:chararray, f3:chararray. f4:chararray); 

B = foreach A generate STRSPLIT(f2,',').$0 as f5, STRSPLIT(f2,',').$1 as f6; 

C = foreach B generate REPLACE(f5,'{',' ') as f7, REPLACE(f6,'}',' ') as f8; 

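I then planned to cast the resulting strings to float with (float). However, the REPLACE command does not work, and I get the following error: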

-bash-3.2$ pig -x local basic.pig 


2013-06-24 16:38:45,030 [main] INFO org.apache.pig.Main - Apache Pig version 0.11.1 (r1459641) compiled Mar 22 2013, 02:13:53

2013-06-24 16:38:45,031 [main] INFO org.apache.pig.Main - Logging error messages to: /home/--/p/--test/pig_1372117125028.log 

2013-06-24 16:38:45,321 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/isl/pmahboubi/.pigbootup not found 

2013-06-24 16:38:45,425 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:/// 

2013-06-24 16:38:46,069 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Lexical error at line 7, column 0. Encountered: <EOF> after : "" 

Details at logfile: /home/--/p/--test/pig_1372117125028.log 

This is pig_137..log:

Pig Stack Trace 
--------------- 
ERROR 1000: Error during parsing. Lexical error at line 7, column 0. Encountered: <EOF> after : "" 

org.apache.pig.tools.pigscript.parser.TokenMgrError: Lexical error at line 7, column 0. Encountered: <EOF> after : "" 
    at org.apache.pig.tools.pigscript.parser.PigScriptParserTokenManager.getNextToken(PigScriptParserTokenManager.java:3266) 
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.jj_ntk(PigScriptParser.java:1134) 
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:104) 
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194) 
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170) 
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84) 
    at org.apache.pig.Main.run(Main.java:604) 
    at org.apache.pig.Main.main(Main.java:157) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 
    at java.lang.reflect.Method.invoke(Method.java:597) 
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197) 
================================================================================ 

Answers

3

I have data that looks like this:

2724 1919 2012-11-18T23:57:56.000Z {(33.80981975),(-118.105289)} 
2703 6401 2012-11-18T23:57:56.000Z {(55.83525609),(-4.07733138)} 
1200 4015 2012-11-18T23:57:56.000Z {(41.49609152),(13.8411998)} 
7104 9227 2012-11-18T23:57:56.000Z {(-24.95351118),(-53.46538723)} 

and I can process it like this:

A = LOAD 'my_tsv_data' USING PigStorage('\t') AS (id1:int, id2:int, date:chararray, loc:chararray); 
B = FOREACH A GENERATE REPLACE(loc,'\\{|\\}|\\(|\\)','');                         
C = LIMIT B 10;                                   
DUMP C; 
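
To make it easier to see what that pattern does, here is a small Python sketch of the same idea. It is only an illustration outside of Pig: the function name parse_pair is my own, and the regex is the same alternation written as a Python raw string. As far as I understand, the backslashes are doubled in the Pig script because the Pig string literal consumes one level of escaping before the regex engine sees the pattern.

```python
import re

# Same alternation as in the Pig REPLACE call, as a Python raw string:
# match a literal {, }, ( or ).
BRACKETS = re.compile(r"\{|\}|\(|\)")


def parse_pair(loc):
    """Strip braces and parentheses, then split on the comma into two floats.

    parse_pair is a hypothetical helper for illustration; it is not part of Pig.
    """
    cleaned = BRACKETS.sub("", loc)
    first, second = cleaned.split(",")
    return float(first), float(second)


# One value in the format from this answer, one in the format from the question.
print(parse_pair("{(33.80981975),(-118.105289)}"))
print(parse_pair("{23.8375,-2.339921102}"))
```

The point is that the braces and parentheses have to be escaped because they are regex metacharacters; once they are stripped, the remaining string splits cleanly on the comma.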
2

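This error

ERROR 1000: Error during parsing. Lexical error at line 7, column 0. Encountered: <EOF> after : "" 

happened to me because I had mixed different types of quotation marks. I opened a string with ' and closed it with a different, similar-looking quote character, and it took me quite a while to find what was wrong. So it had nothing to do with line 7 (my script was not that long, and I had shortened the data to four lines, which naturally did not help), nothing to do with column 0, nothing to do with an EOF in the data, and hardly anything to do with " marks, which I had not used. In short, a rather misleading error message.

I tracked down the cause by running the statements in grunt, the Pig command shell.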
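
If you suspect the same problem, a quick check outside of Pig can save some time. The following Python sketch scans a script for quote-like characters that are easy to mistake for the ASCII apostrophe. The function name find_suspect_quotes and the set of characters are my own assumptions and are not exhaustive. Backticks are deliberately left out, because Pig uses them for shell commands.

```python
import unicodedata

# Characters that look like ' or " but are not the ASCII quotes.
# This set is an illustrative assumption, not a complete list.
SUSPECT_QUOTES = "\u2018\u2019\u201c\u201d\u00b4"


def find_suspect_quotes(script_text):
    """Return (line, column, character name) for each suspect character.

    Line and column numbers are 1-based.
    """
    hits = []
    for line_no, line in enumerate(script_text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch in SUSPECT_QUOTES:
                hits.append((line_no, col_no, unicodedata.name(ch)))
    return hits


# Hypothetical one-line script whose first string is closed with a
# typographic quote instead of an ASCII apostrophe.
sample = "C = foreach B generate REPLACE(f5, '{\u2019, ' ') as f7;"
for hit in find_suspect_quotes(sample):
    print(hit)
```

To check a real script, read the file as text and pass its contents to the function. An empty result only means that none of the listed characters are present, not that the quoting is correct.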

+0

This answer saved my day. It really is a very misleading error message.