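Long-running Hadoop job stuck at reduce > reduce

My Hadoop job basically just aggregates values by key. Here is the reducer code (the mapper is the identity mapper):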
public void reduce(Text key, Iterator<Text> values,
        OutputCollector<Text, Text> results, Reporter reporter) throws IOException {
    String res = new String("");
    while (values.hasNext())
    {
        res += values.next().toString();
    }
    Text outputValue = new Text("<all><id>" + key.toString() + "</id>" + res + "</all>");
    results.collect(key, outputValue);
}
It gets stuck at this point:
12/11/26 06:19:23 INFO mapred.JobClient: Running job: job_201210240845_0099
12/11/26 06:19:24 INFO mapred.JobClient: map 0% reduce 0%
12/11/26 06:19:37 INFO mapred.JobClient: map 20% reduce 0%
12/11/26 06:19:40 INFO mapred.JobClient: map 80% reduce 0%
12/11/26 06:19:41 INFO mapred.JobClient: map 100% reduce 0%
12/11/26 06:19:46 INFO mapred.JobClient: map 100% reduce 6%
12/11/26 06:19:55 INFO mapred.JobClient: map 100% reduce 66%
I ran it locally and saw this:
12/11/26 06:06:48 INFO mapred.LocalJobRunner:
12/11/26 06:06:48 INFO mapred.Merger: Merging 5 sorted segments
12/11/26 06:06:48 INFO mapred.Merger: Down to the last merge-pass, with 5 segments left of total size: 82159206 bytes
12/11/26 06:06:48 INFO mapred.LocalJobRunner:
12/11/26 06:06:54 INFO mapred.LocalJobRunner: reduce > reduce
12/11/26 06:06:55 INFO mapred.JobClient: map 100% reduce 66%
12/11/26 06:06:57 INFO mapred.LocalJobRunner: reduce > reduce
12/11/26 06:07:00 INFO mapred.LocalJobRunner: reduce > reduce
12/11/26 06:07:03 INFO mapred.LocalJobRunner: reduce > reduce
...
a lot of reduce > reduce ...
...
In the end, the job did finish. I would like to ask:
1) What is it doing during this reduce > reduce phase?
2) How can I improve it?
Anything in the logs?
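One thing that may be worth checking, assuming the time is really spent inside the reducer shown above (the task logs would have to confirm this): `res += ...` in a loop copies the whole accumulated string on every iteration, so the cost grows quadratically with the total size of the values for a key. With roughly 82 MB going into the final merge, that could plausibly account for a long reduce > reduce phase. Below is a minimal sketch of the same concatenation done with a `StringBuilder`. It is plain Java with no Hadoop dependency so it can be run on its own; the class name `ReduceConcatSketch` and the method `buildOutput` are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;

public class ReduceConcatSketch {

    // Hypothetical helper, not part of Hadoop: builds the same
    // "<all><id>KEY</id>VALUES</all>" string as the reducer in the question.
    // StringBuilder appends into a growable buffer instead of copying the
    // whole accumulated string on each iteration, as `res += value` does.
    public static String buildOutput(String key, Iterator<String> values) {
        StringBuilder sb = new StringBuilder();
        sb.append("<all><id>").append(key).append("</id>");
        while (values.hasNext()) {
            sb.append(values.next());
        }
        sb.append("</all>");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Small made-up sample values, only to show the output shape.
        Iterator<String> values = Arrays.asList("<a/>", "<b/>", "<c/>").iterator();
        System.out.println(buildOutput("k1", values));
    }
}
```

In the actual reducer the same idea would apply with `Iterator<Text>`, appending `values.next().toString()` to the builder and wrapping the result in a `Text` before calling `results.collect`. This is a guess at the bottleneck, not a confirmed diagnosis.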