
I have a CSV file uploaded to HDFS and I am using the opencsv parser to read the data. My jar file is also on the hadoop classpath, and I have uploaded it to the following location in HDFS: /jars/opencsv-3.9.jar. The error I get is attached below as well: a CSV class not found exception.

Here is my code snippet:

package mcad;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

import com.opencsv.CSVParser;

public class TermLabelledPapers { 

    public static class InputMapper extends Mapper<LongWritable, Text, Text, Text> { 

    @Override 
    protected void map(LongWritable key, Text value, Context context) 
      throws IOException, InterruptedException { 

     CSVParser parser = new CSVParser(); 
     String[] lines = parser.parseLine(value.toString()); 
     //readEntry.readHeaders(); 
     String doi = lines[0]; 
     String keyphrases = lines[3]; 

     Get g = new Get(Bytes.toBytes(doi.toString())); 
     context.write(new Text(doi), new Text(keyphrases)); 

    } 
} 

public static class PaperEntryReducer extends TableReducer<Text, Text, ImmutableBytesWritable> { 

    @Override 
    protected void reduce(Text doi, Iterable<Text> values, Context context) 
      throws IOException, InterruptedException { 

    } 
} 


public static void main(String[] args) throws Exception { 

    Configuration conf = HBaseConfiguration.create(); 
    conf.set("hbase.zookeeper.quorum", "172.17.25.18"); 
    conf.set("hbase.zookeeper.property.clientPort", "2183"); 
    //add the external jar to hadoop distributed cache 
    //addJarToDistributedCache(CsvReader.class, conf); 

    Job job = new Job(conf, "TermLabelledPapers"); 
    job.setJarByClass(TermLabelledPapers.class); 
    job.setMapperClass(InputMapper.class); 
    job.setMapOutputKeyClass(Text.class); 
    job.setMapOutputValueClass(Text.class); 
    job.addFileToClassPath(new Path("/jars/opencsv-3.9.jar")); 
    FileInputFormat.setInputPaths(job, new Path(args[0])); // "metadata.csv" 

    TableMapReduceUtil.initTableReducerJob("PaperBagofWords", PaperEntryReducer.class, job); 
    job.setReducerClass(PaperEntryReducer.class); 
    job.waitForCompletion(true); 
} 

} 

After running the job, the error it shows is:

Error: java.lang.ClassNotFoundException: com.csvreader.CsvReader 
at java.net.URLClassLoader.findClass(URLClassLoader.java:381) 
at java.lang.ClassLoader.loadClass(ClassLoader.java:424) 
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) 
at java.lang.ClassLoader.loadClass(ClassLoader.java:357) 
at mcad.TermLabelledPapers$InputMapper.map(TermLabelledPapers.java:69) 
at mcad.TermLabelledPapers$InputMapper.map(TermLabelledPapers.java:1) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) 
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) 
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

Did you add it to the hadoop classpath? Then check with the 'hadoop classpath' command to make sure it is there.
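
For reference, a minimal sketch of the same check done programmatically (the class name is copied from the stack trace above; everything else is illustrative). Running the same lookup from inside the mapper, for example in setup(), shows whether the task JVM can actually see the class, since the task containers do not necessarily share the classpath that 'hadoop classpath' prints on the client.

public class ClasspathCheck { 
    public static void main(String[] args) { 
     // Class name copied from the stack trace; replace it with whatever your mapper imports. 
     String cls = "com.csvreader.CsvReader"; 
     try { 
      Class.forName(cls); 
      System.out.println(cls + " is visible on this classpath"); 
     } catch (ClassNotFoundException e) { 
      System.out.println(cls + " is NOT visible on this classpath"); 
     } 
    } 
} 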

Answer


Ideally this error should not occur if the jar is on the Hadoop classpath. If this is a Maven project, you can try building a jar-with-dependencies, which bundles all dependent jars together with your own. This can help diagnose the problem.
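
As an alternative to repackaging, here is a minimal sketch of shipping the dependency jars with the job submission itself. The class literals are assumptions: opencsv 3.x puts CSVParser in the com.opencsv package, and com.csvreader.CsvReader (the class the stack trace reports missing) comes from a different CSV library, so adjust them to whatever your code actually imports.

    // Sketch only: call this from main() right after creating the Job. 
    // TableMapReduceUtil looks up the jar that contains each class and adds it 
    // to the distributed cache, so the mapper tasks can resolve it at runtime. 
    private static void shipParserJars(Job job) throws IOException { 
     TableMapReduceUtil.addDependencyJars(job.getConfiguration(), 
       com.opencsv.CSVParser.class,    // opencsv 3.9 parser used in map() 
       com.csvreader.CsvReader.class); // class reported missing in the error 
    } 

Note that the missing class is not part of opencsv, so shipping /jars/opencsv-3.9.jar alone will not satisfy it; whichever jar provides com.csvreader.CsvReader has to reach the task classpath as well.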