我正在使用Hadoop 0.20,我想要兩個減少輸出文件而不是一個輸出。我知道MultipleOutputFormat
在Hadoop 0.20中不起作用。我在Eclipse的項目構建路徑中添加了hadoop1.1.1-core jar文件。但它仍然顯示最後一個錯誤。如何在Hadoop 0.20中使用MultipleoutputFormai?
這裏是我的代碼:
public static class ReduceStage extends Reducer<IntWritable, BitSetWritable, IntWritable, Text>
{
private MultipleOutputs mos;
public ReduceStage() {
System.out.println("ReduceStage");
}
public void setup(Context context) {
mos = new MultipleOutputs(context);
}
public void reduce(final IntWritable key, final Iterable<BitSetWritable> values, Context output) throws IOException, InterruptedException
{
mos.write("text1", key, new Text("Hello"));
}
public void cleanup(Context context) throws IOException {
try {
mos.close();
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
而且在run():
FileOutputFormat.setOutputPath(job, ConnectedComponents_Nodes);
job.setOutputKeyClass(MultipleTextOutputFormat.class);
MultipleOutputs.addNamedOutput(job, "text1", TextOutputFormat.class,
IntWritable.class, Text.class);
的錯誤是:
java.lang.NoSuchMethodError: org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.setOutputName(Lorg/apache/hadoop/mapreduce/JobContext;Ljava/lang/String;)V
at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.getRecordWriter(MultipleOutputs.java:409)
at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:370)
at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:348)
at bitsetmr$ReduceStage.reduce(bitsetmr.java:179)
at bitsetmr$ReduceStage.reduce(bitsetmr.java:1)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
我能做些什麼有MultipleOutputFormat
?我是否正確使用了代碼?
建築與1.1.1但運行在0.20不起作用。實際上,0.20會先加載,1.1.1不能覆蓋0.20。 – zsxwing
@zsxwing:那麼如何在hadoop 0.20中使用multipleoutputformat? –
您需要將這些代碼複製到您的項目中,或升級您的hadoop。 – zsxwing