2014-10-07 23 views
$ cat webaccess.txt
mark,yahoo.com,6 
sam,google.com,7 
john,yahoo.com,3 
patrick,cnn.com,8 
mary,facebook.com,1 
mark,yahoo.com,4 
john,bbc.com,10 
andrew,twitter.com,3 
patrick,twitter.com,9 

Apache Pig: ILLUSTRATE command error

I am running the following task in the Hue Pig shell (Grunt) on the Cloudera QuickStart VM.

grunt> stage1 = LOAD '/user/cloudera/webaccess.txt' USING PigStorage(',') AS (name:chararray, website:chararray, access:int); 
grunt> DUMP stage1; 
grunt> stage2 = FILTER stage1 by access >= 8; 
grunt> stage3 = GROUP stage1 by name; 
grunt> stage4 = FOREACH stage3 GENERATE group as GROUPS, MAX(stage1.access); 
grunt> DUMP stage4; 

OUTPUT:

(sam,7) 
(john,10) 
(mark,6) 
(mary,1) 
(andrew,3) 
(patrick,9) 

Up to this point everything works fine.
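
As a quick cross-check outside Pig, the same per-user maximum can be computed with plain shell tools. This is only a sketch: the file name `webaccess_sample.txt` and the function name `max_access_per_name` are assumptions made for illustration, and the data is copied from the listing above.

```shell
# Sample data copied from the question (file name is hypothetical).
cat > webaccess_sample.txt <<'EOF'
mark,yahoo.com,6
sam,google.com,7
john,yahoo.com,3
patrick,cnn.com,8
mary,facebook.com,1
mark,yahoo.com,4
john,bbc.com,10
andrew,twitter.com,3
patrick,twitter.com,9
EOF

# Keep the largest access count seen for each name (like GROUP ... MAX in Pig),
# then sort so the output order is stable.
max_access_per_name() {
  awk -F, '
    NF == 3 && (!($1 in max) || $3 + 0 > max[$1]) { max[$1] = $3 + 0 }
    END { for (name in max) printf "(%s,%d)\n", name, max[name] }
  ' "$1" | sort
}

max_access_per_name webaccess_sample.txt
```

The tuples it prints should match the DUMP output above, apart from ordering.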

When I run the ILLUSTRATE command on the relation stage4 to inspect it, I get the error shown below:

grunt> ILLUSTRATE stage4; 

2014-10-07 04:02:43,639 [main] WARN org.apache.hadoop.conf.Configuration - fs.default.name is deprecated. Instead, use fs.defaultFS 
2014-10-07 04:02:43,642 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost.localdomain:8020 
2014-10-07 04:02:43,643 [main] WARN org.apache.hadoop.conf.Configuration - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum 
2014-10-07 04:02:43,643 [main] WARN org.apache.hadoop.conf.Configuration - dfs.https.address is deprecated. Instead, use dfs.namenode.https-address 
2014-10-07 04:02:43,643 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost.localdomain:8021 
2014-10-07 04:02:43,799 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false 
2014-10-07 04:02:43,800 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1 
2014-10-07 04:02:43,800 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1 
2014-10-07 04:02:43,804 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job 
2014-10-07 04:02:43,805 [main] ERROR org.apache.pig.pen.ExampleGenerator - Error reading data. Internal error creating job configuration. 
java.lang.RuntimeException: Internal error creating job configuration. 
at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:160) 
at org.apache.pig.PigServer.getExamples(PigServer.java:1182) 
at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:739) 
at org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:626) 
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:323) 
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194) 
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170) 
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69) 
at org.apache.pig.Main.run(Main.java:538) 
at org.apache.pig.Main.main(Main.java:157) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 
at java.lang.reflect.Method.invoke(Method.java:597) 
at org.apache.hadoop.util.RunJar.main(RunJar.java:208) 
2014-10-07 04:02:43,868 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Encountered IOException. Exception 
Details at logfile: /dev/null 

I am still learning, and because of this error I cannot move on to the next topic.

Before starting this task, when I first opened the Hue Pig shell (Grunt), I noticed the following warnings.

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName 
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName 
at java.net.URLClassLoader$1.run(URLClassLoader.java:202) 
at java.security.AccessController.doPrivileged(Native Method) 
at java.net.URLClassLoader.findClass(URLClassLoader.java:190) 
at java.lang.ClassLoader.loadClass(ClassLoader.java:306) 
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) 
at java.lang.ClassLoader.loadClass(ClassLoader.java:247) 
Could not find the main class: org.apache.hadoop.util.PlatformName. Program will exit. 
which: no hadoop in ((null)) 
which: no /usr/lib/hadoop/bin/hadoop in ((null)) 
dirname: missing operand 
Try `dirname --help' for more information. 
2014-10-07 03:18:27,802 [main] INFO org.apache.pig.Main - Apache Pig version 0.11.0-cdh4.7.0 (rexported) compiled May 28 2014, 11:05:48 
2014-10-07 03:18:27,803 [main] INFO org.apache.pig.Main - Logging error messages to: /dev/null 
2014-10-07 03:18:28,758 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/cloudera/.pigbootup not found 
2014-10-07 03:18:30,436 [main] WARN org.apache.hadoop.conf.Configuration - fs.default.name is deprecated. Instead, use fs.defaultFS 
2014-10-07 03:18:30,444 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost.localdomain:8020 
2014-10-07 03:18:37,832 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost.localdomain:8021 
2014-10-07 03:18:37,842 [main] WARN org.apache.hadoop.conf.Configuration - fs.default.name is deprecated. Instead, use fs.defaultFS 

Answers

I did not run into any problem; the ILLUSTRATE command works fine for me. Can you try running it in local mode first?

$ pig -x local
    grunt> stage1 = LOAD 'input.txt' USING PigStorage(',') AS (name:chararray, website:chararray, access:int); 
    grunt> stage2 = FILTER stage1 by access >= 8; 
    grunt> stage3 = GROUP stage1 by name; 
    grunt> stage4 = FOREACH stage3 GENERATE group as GROUPS, MAX(stage1.access); 
    grunt> DUMP stage4; 
    (sam,7) 
    (john,10) 
    (mark,6) 
    (mary,1) 
    (andrew,3) 
    (patrick,9) 
    grunt> ILLUSTRATE stage4; 
    ---------------------------------------------------------------------------- 
    | stage1  | name:chararray  | website:chararray  | access:int  | 
    ---------------------------------------------------------------------------- 
    |   | john    | yahoo.com    | 3    | 
    |   | john    | bbc.com    | 10    | 
    ---------------------------------------------------------------------------- 
    -------------------------------------------------------------------------------------------------------------------------- 
    | stage3  | group:chararray  | stage1:bag{:tuple(name:chararray,website:chararray,access:int)}      | 
    -------------------------------------------------------------------------------------------------------------------------- 
    |   | john    | {(john, yahoo.com, 3), (john, bbc.com, 10)}           | 
    |   | john    | {(john, yahoo.com, 3), (john, bbc.com, 10)}           | 
    -------------------------------------------------------------------------------------------------------------------------- 
    ------------------------------------------------ 
    | stage4  | GROUPS:chararray  | :int  | 
    ------------------------------------------------ 
    |   | john     | 10  | 
    ------------------------------------------------ 

Thanks for the reply, siva. I ran it in local mode but I am still getting the error. Please advise. – Green 2014-10-07 12:24:43

This looks like a classpath problem. Please check that all the required JARs are specified on the classpath. See this thread for more details.
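
One way to start checking, offered as a rough sketch rather than a confirmed fix: the startup message `which: no hadoop in ((null))` suggests the shell that launched Pig had no usable PATH, so it may help to confirm that the `hadoop` and `pig` launchers can be found at all. The helper name `check_on_path` is made up for this example; `hadoop classpath` is the standard Hadoop subcommand that prints the classpath Hadoop builds for itself.

```shell
# Report whether a command can be found on the current PATH.
check_on_path() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: NOT found on PATH"
  fi
}

echo "PATH=${PATH:-<empty>}"
check_on_path hadoop
check_on_path pig

# Only ask Hadoop for its classpath if the launcher was actually found;
# print one entry per line so missing JAR directories are easier to spot.
if command -v hadoop >/dev/null 2>&1; then
  hadoop classpath | tr ':' '\n'
fi
```

If `hadoop` is reported as not found, fixing the environment of the shell that starts Pig is probably the first thing to try before looking at individual JARs.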