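CSV class not found exception
I have a CSV file that I uploaded to HDFS, and I am using the opencsv parser to read its data. The jar is on my Hadoop classpath, and I have also uploaded it to HDFS at /jars/opencsv-3.9.jar. The error I get is included below.
Here is my code snippet: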
public class TermLabelledPapers {

    public static class InputMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            CSVParser parser = new CSVParser();
            String[] lines = parser.parseLine(value.toString());
            //readEntry.readHeaders();
            String doi = lines[0];
            String keyphrases = lines[3];
            Get g = new Get(Bytes.toBytes(doi.toString()));
            context.write(new Text(doi), new Text(keyphrases));
        }
    }

    public static class PaperEntryReducer extends TableReducer<Text, Text, ImmutableBytesWritable> {
        @Override
        protected void reduce(Text doi, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "172.17.25.18");
        conf.set("hbase.zookeeper.property.clientPort", "2183");
        //add the external jar to hadoop distributed cache
        //addJarToDistributedCache(CsvReader.class, conf);
        Job job = new Job(conf, "TermLabelledPapers");
        job.setJarByClass(TermLabelledPapers.class);
        job.setMapperClass(InputMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.addFileToClassPath(new Path("/jars/opencsv-3.9.jar"));
        FileInputFormat.setInputPaths(job, new Path(args[0])); // "metadata.csv"
        TableMapReduceUtil.initTableReducerJob("PaperBagofWords", PaperEntryReducer.class, job);
        job.setReducerClass(PaperEntryReducer.class);
        job.waitForCompletion(true);
    }
}
When I run the job, it fails with this error:
Error: java.lang.ClassNotFoundException: com.csvreader.CsvReader
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at mcad.TermLabelledPapers$InputMapper.map(TermLabelledPapers.java:69)
at mcad.TermLabelledPapers$InputMapper.map(TermLabelledPapers.java:1)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Have you added it to the Hadoop classpath? If so, run the 'hadoop classpath' command to check that it is actually there.
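As a complement to the 'hadoop classpath' check, here is a minimal sketch of a helper that tests from inside a JVM whether a class can be loaded, and prints the classpath entries that JVM sees. The class name ClasspathCheck and the method isLoadable are hypothetical names made up for this illustration; they are not part of Hadoop or opencsv. The stack trace shows the exception being thrown inside the map task (YarnChild), so running this on the client only tells you about the client JVM. To learn what the task sees, you would presumably need to call isLoadable from the mapper itself and log the result.

```java
import java.io.File;

// Hypothetical helper (not from the original post): reports whether a class
// is visible to the current JVM and lists that JVM's classpath entries.
class ClasspathCheck {

    // Returns true if the named class can be found by this class's loader.
    // The class is looked up without running its static initializers.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace above.
        String name = args.length > 0 ? args[0] : "com.csvreader.CsvReader";
        System.out.println(name + " loadable: " + isLoadable(name));

        // Print each classpath entry of this JVM on its own line.
        String classPath = System.getProperty("java.class.path");
        for (String entry : classPath.split(File.pathSeparator)) {
            System.out.println("  " + entry);
        }
    }
}
```

Note that the class in the stack trace (com.csvreader.CsvReader) is not the class the posted snippet uses (CSVParser), so it is worth checking both names.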