Hadoop reduce function does not work

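I am learning Hadoop and wrote a simple program in Java. The program should count words (and create a file with each word and the number of times it occurs), but it only creates a file that lists every occurrence of every word with a "1" next to it. The output looks like this: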

RMD 1
RMD 1
RMD 1
RMD 1
rmdaxsxgb 1

But I want:

RMD 4
rmdaxsxgb 1

As far as I can tell, only the map function runs. (I tried commenting out the reduce function and got the same result.)

My code (this is a typical MapReduce example that is easy to find on the internet or in books about Hadoop):

public class WordCount { 

public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> { 
    private final static IntWritable one = new IntWritable(1); 
    private Text word = new Text(); 

    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException { 
     String line = value.toString(); 
     StringTokenizer tokenizer = new StringTokenizer(line); 
     while (tokenizer.hasMoreTokens()) { 
      word.set(tokenizer.nextToken()); 
      context.write(word, one); 
     } 
    } 
} 

public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> { 

    public void reduce(Text key, Iterator<IntWritable> values, Context context) 
     throws IOException, InterruptedException { 
     int sum = 0; 
     while (values.hasNext()) { 
      sum += values.next().get(); 
     } 
     context.write(key, new IntWritable(sum)); 
    } 
} 


public static void main(String[] args) throws Exception { 
     Configuration conf = new Configuration(); 

     Job job = new Job(conf, "wordcount"); 
     job.setJarByClass(WordCount.class); 

     job.setOutputKeyClass(Text.class); 
     job.setOutputValueClass(IntWritable.class); 

     job.setMapperClass(Map.class); 
     job.setReducerClass(Reduce.class); 

     job.setInputFormatClass(TextInputFormat.class); 
     job.setOutputFormatClass(TextOutputFormat.class); 

     FileInputFormat.addInputPath(job, new Path(args[0])); 
     FileOutputFormat.setOutputPath(job, new Path(args[1])); 

     job.waitForCompletion(true); 
    }
}

I am running Hadoop on Amazon Web Services and don't understand why it doesn't work correctly.


2015-05-01 Ales

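This is probably caused by mixing the two APIs. Hadoop has two APIs: the old one is mapred and the newer one is mapreduce.

In the new API, the reducer receives its values as an Iterable, whereas the old API passes an Iterator, which is what your code uses. Because the parameter type differs, your reduce method does not override the one in Reducer, so the framework never calls it.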

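Try declaring the values parameter of reduce as Iterable<IntWritable> instead of Iterator<IntWritable>, and loop over it with a for-each loop.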
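The Iterator/Iterable mismatch can be reproduced without a cluster. The following is a minimal plain-Java sketch, not Hadoop code: BaseReducer is a hypothetical stand-in for the framework's Reducer base class, simplified to String keys and Integer values, and it assumes (as in the new API) that the default reduce passes every value through unchanged. IteratorReducer, IterableReducer and OverloadDemo are likewise illustrative names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a framework reducer base class (NOT Hadoop's real class).
// Its default reduce() emits every value unchanged, i.e. an identity reduce.
class BaseReducer {
    public final List<String> output = new ArrayList<>();

    protected void reduce(String key, Iterable<Integer> values) {
        for (Integer v : values) {
            output.add(key + " " + v);
        }
    }

    // The "framework" side: it only ever calls the Iterable version.
    public final void run(String key, List<Integer> values) {
        reduce(key, values);
    }
}

// Takes an Iterator: this is an overload, not an override, so run() never reaches it.
class IteratorReducer extends BaseReducer {
    protected void reduce(String key, Iterator<Integer> values) {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next();
        }
        output.add(key + " " + sum);
    }
}

// Takes an Iterable: a real override. @Override makes the compiler check that.
class IterableReducer extends BaseReducer {
    @Override
    protected void reduce(String key, Iterable<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        output.add(key + " " + sum);
    }
}

class OverloadDemo {
    public static void main(String[] args) {
        List<Integer> ones = Arrays.asList(1, 1, 1, 1);

        BaseReducer wrong = new IteratorReducer();
        wrong.run("RMD", ones);
        System.out.println(wrong.output); // [RMD 1, RMD 1, RMD 1, RMD 1]

        BaseReducer right = new IterableReducer();
        right.run("RMD", ones);
        System.out.println(right.output); // [RMD 4]
    }
}
```

Adding @Override to the Iterator version would turn the silent mismatch into a compile error, which is a cheap way to catch this kind of mistake in a real reducer as well.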


2015-05-01 13:56:26

Thanks, I tried it and it helped, but it should be 'Iterable<IntWritable> values'; you have a typo. – Ales

@Ales: Thanks, edited –

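It looks like no reducer is running on the Hadoop cluster. You can set the number of reduce tasks in three ways. One is to set the property in your mapred-site.xml, like this: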

<property> 
<name>mapred.reduce.tasks</name> 
<value>1</value> 
</property>

or by passing an option such as

-D mapred.reduce.tasks=1

on the command line, or by setting it in the main class:

job.setNumReduceTasks(1);

To make the setting permanent for all jobs, set the property in your mapred-site.xml.


2015-05-01 13:11:26 salmanbw
