
Hadoop ClassNotFoundException related to MapClass

I have seen many questions related to ClassNotFoundException, "No job jar file set", and Hadoop. Most of them point to a missing setJarByClass call in the configuration (using either JobConf or Job). I am a bit confused because I am hitting this exception anyway. Here is everything I think is relevant (please let me know if I have left anything out):

echo $CLASS_PATH 
/root/javajars/mysql-connector-java-5.1.22/mysql-connector-java-5.1.22-bin.jar:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u5.jar:. 

Code (mostly elided):

import org.apache.hadoop.mapreduce.Job; 
import org.apache.hadoop.mapreduce.Mapper; 
import org.apache.hadoop.mapreduce.Reducer; 
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat; 
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat; 
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat; 
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat; 
import org.apache.hadoop.fs.Path; 
import org.apache.hadoop.conf.Configuration; 
import org.apache.hadoop.conf.Configured; 
import org.apache.hadoop.util.ToolRunner; 
import org.apache.hadoop.util.Tool; 
import org.apache.hadoop.util.GenericOptionsParser; 
import org.apache.hadoop.io.LongWritable; 
import org.apache.hadoop.io.Text; 
import org.apache.hadoop.io.IntWritable; 

import java.io.IOException; 
import java.util.Iterator; 
import java.lang.System; 
import java.net.URL; 

import java.sql.Connection; 
import java.sql.DriverManager; 
import java.sql.SQLException; 
import java.sql.Statement; 
import java.sql.ResultSet; 

public class QueryTable extends Configured implements Tool { 

    public static class MapClass extends Mapper<Object, Text, Text, IntWritable>{ 

    public void map(Object key, Text value, Context context) 
      throws IOException, InterruptedException { 
      ... 
     } 
    } 

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable>{ 
     private IntWritable result = new IntWritable(); 

     public void reduce (Text key, Iterable<IntWritable> values, 
          Context context) throws IOException, InterruptedException { 
      ... 
     } 
    } 

    public int run(String[] args) throws Exception { 
     //Configuration conf = getConf();                                                          
     Configuration conf = new Configuration(); 

     Job job = new Job(conf, "QueryTable"); 
     job.setJarByClass(QueryTable.class); 

     Path in = new Path(args[0]); 
     Path out = new Path(args[1]); 
     FileInputFormat.setInputPaths(job, in); 
     //FileInputFormat.addInputPath(job, in);                                                         
     FileOutputFormat.setOutputPath(job, out); 

     job.setMapperClass(MapClass.class); 
     job.setCombinerClass(Reduce.class); // new                                                        
     job.setReducerClass(Reduce.class); 

     job.setInputFormatClass(TextInputFormat.class); 
     job.setOutputFormatClass(TextOutputFormat.class); 
     job.setOutputKeyClass(Text.class); 
     job.setOutputValueClass(Text.class); 

     System.exit(job.waitForCompletion(true)?0:1); 
     return 0; 
    } 

    public static void main(String[] args) throws Exception { 
     int res = ToolRunner.run(new Configuration(), new QueryTable(), args); 
     System.exit(res); 
    } 
} 

I then compile, create the jar, and run it:

javac QueryTable.java -d QueryTable 
jar -cvf QueryTable.jar -C QueryTable/ . 
hadoop jar QueryTable.jar QueryTable input output 

Here is the exception:

13/01/14 17:09:30 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. 
**13/01/14 17:09:30 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).** 
13/01/14 17:09:30 INFO input.FileInputFormat: Total input paths to process : 1 
13/01/14 17:09:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
13/01/14 17:09:30 WARN snappy.LoadSnappy: Snappy native library not loaded 
13/01/14 17:09:31 INFO mapred.JobClient: Running job: job_201301081120_0045 
13/01/14 17:09:33 INFO mapred.JobClient: map 0% reduce 0% 
13/01/14 17:09:39 INFO mapred.JobClient: Task Id : attempt_201301081120_0045_m_000000_0, Status : FAILED 
java.lang.RuntimeException: java.lang.ClassNotFoundException: QueryTable$MapClass 
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1004) 
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:217) 
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:602) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323) 
    at org.apache.hadoop.mapred.Child$4.run(Child.java:266) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278) 
    at org.apache.hadoop.mapred.Child.main(Child.java:260) 
Caused by: java.lang.ClassNotFoundException: QueryTable$MapClass 
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366) 
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354) 
    at java.lang.ClassLoader.loadCl 

Sorry for the huge wall of text. I don't understand why I'm getting the warning about no job jar file being set, since I set it in my run method. Also, the warning is issued by JobClient, while in my code I use Job rather than JobClient. If you have any ideas or feedback, I'd be interested to hear them. Thanks for your time!

EDIT

Contents of the jar:

jar -tvf QueryTable.jar 
    0 Tue Jan 15 14:40:46 EST 2013 META-INF/ 
    68 Tue Jan 15 14:40:46 EST 2013 META-INF/MANIFEST.MF 
3091 Tue Jan 15 14:40:10 EST 2013 QueryTable.class 
3173 Tue Jan 15 14:40:10 EST 2013 QueryTable$MapClass.class 
1699 Tue Jan 15 14:40:10 EST 2013 QueryTable$Reduce.class 

Can you do a jar -tvf on your jar to show its contents (and paste the output back into your question, not as a comment)? –

Answer


I was able to resolve this by declaring a package at the top of my source file:

package com.foo.hadoop; 

Then I compiled, created the jar, and invoked hadoop with the class name explicitly prefixed by the package:

hadoop jar QueryTable.jar com.foo.hadoop.QueryTable input output 

I understand this is what most people do from the start, though I had thought it would still work without specifying a package. Declaring one is definitely better practice, and it got me unblocked.
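For reference, a minimal sketch of how the build and run steps change once a package is declared (com.foo.hadoop is the placeholder package name from above, and the build/ output directory is just an example; this assumes the Hadoop jars are already on the compile classpath as in the question):

mkdir -p build 
javac -d build QueryTable.java                 # writes build/com/foo/hadoop/QueryTable*.class 
jar -cvf QueryTable.jar -C build/ .            # jar entries keep the com/foo/hadoop/ directory prefix 
jar -tvf QueryTable.jar                        # should list com/foo/hadoop/QueryTable$MapClass.class 
hadoop jar QueryTable.jar com.foo.hadoop.QueryTable input output 

The same jar -tvf check shown in the question's edit is a quick way to confirm that the jar entries carry the package prefix and match the fully qualified class name passed to hadoop.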


I got the same problem when I exported the jar as a Runnable JAR file. I changed it to a normal JAR and used your approach with the full class name including its package, and it worked fine. – himanshu


Doesn't work for me, still getting 'ClassNotFoundException: com.foo.hadoop.SomeClass' – CDT


What does your jar creation command look like? And what happens when you run "jar -tvf your_jar"? – cbrown