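Hadoop: MultipleInputs not working properly

I can't seem to get MultipleInputs to read two separate files for processing. The output file always comes out empty. I have tried to learn and debug by following example code found online, but I still can't get it to work.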
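import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CommonWords {

    // Tags every word of the first input file with "a".
    public static class Mapper1 extends Mapper<Object, Text, Text, Text> {
        private final Text word = new Text();
        private static final Text identifier = new Text("a");

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, identifier);
            }
        }
    }

    // Mapper2 is not shown: it is identical to Mapper1 except that it tags words with "b".

    // Emits, for each word present in both inputs, the smaller of its two counts.
    // Named CommonWordsReducer so it does not shadow Hadoop's Reducer base class.
    public static class CommonWordsReducer extends Reducer<Text, Text, Text, IntWritable> {
        private final IntWritable commoncount = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            int count1 = 0;
            int count2 = 0;
            for (Text val : values) {
                // Text.equals(String) is always false, so compare the string form.
                String tag = val.toString();
                if (tag.equals("a"))
                    count1++;
                else if (tag.equals("b"))
                    count2++;
            }
            if (count1 != 0 && count2 != 0) {
                commoncount.set(Math.min(count1, count2));
                context.write(key, commoncount);
            }
        }
    }

    public static void main(String[] args) throws IOException,
            InterruptedException, ClassNotFoundException {
        Configuration conf = new Configuration();
        Job job1 = new Job(conf, "Testing");
        job1.setJarByClass(CommonWords.class);
        job1.setMapOutputKeyClass(Text.class);
        job1.setMapOutputValueClass(Text.class);
        job1.setOutputKeyClass(Text.class);
        job1.setOutputValueClass(IntWritable.class);
        job1.setReducerClass(CommonWordsReducer.class);
        // MultipleInputs binds a mapper to each path, so setMapperClass is not needed.
        // Assuming plain-text input: TextInputFormat passes the whole line as the value,
        // whereas KeyValueTextInputFormat passes only the text after the first tab,
        // which is empty for lines that contain no tab.
        MultipleInputs.addInputPath(job1, new Path(args[0]), TextInputFormat.class, Mapper1.class);
        MultipleInputs.addInputPath(job1, new Path(args[1]), TextInputFormat.class, Mapper2.class);
        FileOutputFormat.setOutputPath(job1, new Path(args[2]));
        System.exit(job1.waitForCompletion(true) ? 0 : 1);
    }
}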
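
One possible cause of empty output, assuming the input files are plain text with no tab characters: KeyValueTextInputFormat splits each line at the first tab, hands the part before it to the mapper as the key and the part after it as the value. A line without a tab becomes the key in full and the value is empty, so a mapper that tokenizes only the value emits nothing. The sketch below imitates that splitting rule in plain Java; the class and method names are hypothetical and no Hadoop classes are involved.

```java
import java.util.StringTokenizer;

// Plain-Java imitation of a tab-separated key/value line split.
// Hypothetical names; this is not Hadoop code.
class KeyValueSplitSketch {

    // Returns {key, value}, splitting at the first tab.
    // A line with no tab becomes the key and the value is empty.
    static String[] splitKeyValue(String line) {
        int pos = line.indexOf('\t');
        if (pos < 0) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, pos), line.substring(pos + 1) };
    }

    // Counts the tokens a mapper would see if it tokenized only the value.
    static int tokensInValue(String line) {
        return new StringTokenizer(splitKeyValue(line)[1]).countTokens();
    }

    public static void main(String[] args) {
        // No tab: the whole line is the key, so the value has no tokens.
        System.out.println(tokensInValue("the quick brown fox"));      // prints 0
        // With a tab: everything after it is the value.
        System.out.println(tokensInValue("id1\tthe quick brown fox")); // prints 4
    }
}
```

If the inputs really are plain text, TextInputFormat delivers the whole line as the value, which is what a tokenizing mapper of this shape expects.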
Where is the "Mapper2" class? – Thanga
Hi, the Mapper2 class is exactly the same as Mapper1, except that the text is set to "b". – gatsby
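
With one mapper tagging words "a" and the other tagging them "b", the reducer only has to count each tag per word and emit the smaller count when both are present. Below is a rough plain-Java sketch of that counting step, using String tags as a stand-in for Hadoop Text values (the class and method names are hypothetical). In a real reducer a Text would need to be converted with toString() before being compared to a string literal, because Text.equals returns false for any argument that is not a Text.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of the per-word counting done in the reducer.
// Hypothetical names; String tags stand in for Hadoop Text values.
class CommonCountSketch {

    // Returns min(count of "a", count of "b") when both tags occur,
    // or -1 to signal that nothing would be emitted for this word.
    static int commonCount(List<String> tags) {
        int count1 = 0;
        int count2 = 0;
        for (String tag : tags) {
            if (tag.equals("a")) {
                count1++;
            } else if (tag.equals("b")) {
                count2++;
            }
        }
        if (count1 != 0 && count2 != 0) {
            return Math.min(count1, count2);
        }
        return -1;
    }

    public static void main(String[] args) {
        // Word seen twice in the first file and once in the second.
        System.out.println(commonCount(Arrays.asList("a", "b", "a"))); // prints 1
        // Word seen only in the first file: nothing to emit.
        System.out.println(commonCount(Arrays.asList("a", "a")));      // prints -1
    }
}
```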