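When I specify the root of a large directory tree as the LOAD input in my script, Pig fails mysteriously. The backend error exception it throws gives no insight into what happened. With fewer files, the same script works perfectly. How many files can I submit to a single Pig job at once?
It is a very simple script, as you can see below: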
SET pig.noSplitCombination true;
raw_record = LOAD '/data/directory/tree/root' USING PigStorage(',');
filtered = FILTER raw_record by $1 == 251068;
filtered_data = FOREACH filtered GENERATE (chararray)$0, (chararray)$1, (chararray)$2;
STORE filtered_data INTO '/data/output/directory/' USING PigStorage();
Here is the error message I see:
ERROR 2244: Job scope-594 failed, hadoop does not return any error message
org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job scope-594 failed, hadoop does not return any error message
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:178)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:232)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:608)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
How many files can Pig process at once?
It looks like it is already failing in the front end. Can you access the job settings on the job server? And why are you setting pig.noSplitCombination? – LiMuBei
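For context on that last question: with pig.noSplitCombination set to true, Pig does not merge small input files into combined splits, so every file under the directory tree produces at least one map task. On a very large tree that may be what overwhelms job setup, although the error above does not confirm it. A minimal sketch of the same script with split combination left enabled might look like this; the 128 MB value for pig.maxCombinedSplitSize is purely an illustrative assumption and would need tuning for the cluster:

```pig
-- Sketch only: same pipeline as in the question, but with split combination enabled
-- so that many small files can share one map task.
SET pig.splitCombination true;
-- Assumed value (128 MB); not taken from the question.
SET pig.maxCombinedSplitSize 134217728;
raw_record = LOAD '/data/directory/tree/root' USING PigStorage(',');
filtered = FILTER raw_record BY $1 == 251068;
filtered_data = FOREACH filtered GENERATE (chararray)$0, (chararray)$1, (chararray)$2;
STORE filtered_data INTO '/data/output/directory/' USING PigStorage();
```

If the job still fails with this change, the number of map tasks is probably not the cause, and the job settings and logs on the job server would be the next place to look.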