2012-03-26 40 views
0

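I wrote an MR program to estimate pi (3.141592...) as shown below, but I ran into a question: how does the number of times the map() function is called relate to the number of map tasks in an MR job?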

The framework launched 11 map tasks, and the output below has 35 lines in total. I expected 11 lines of output. Is there something I'm missing?

INCIRCLE 78534096
INCIRCLE 78539304
INCIRCLE 78540871
INCIRCLE 78537925
INCIRCLE 78537161
INCIRCLE 78544419
INCIRCLE 78537045
INCIRCLE 78534861
INCIRCLE 78545779
INCIRCLE 78528890
INCIRCLE 78540007
INCIRCLE 78542686
INCIRCLE 78534539
INCIRCLE 78538255
INCIRCLE 78543392
INCIRCLE 78543191
INCIRCLE 78540938
INCIRCLE 78534882
INCIRCLE 78536155
INCIRCLE 78545739
INCIRCLE 78541807
INCIRCLE 78540635
INCIRCLE 78547561
INCIRCLE 78540521
INCIRCLE 78541320
INCIRCLE 78537605
INCIRCLE 78541379
INCIRCLE 78540408
INCIRCLE 78536238
INCIRCLE 78539614
INCIRCLE 78539773
INCIRCLE 78537169
INCIRCLE 78541707
INCIRCLE 78537141
INCIRCLE 78538045

// program starts
import ...

public class PiEstimation {

public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, LongWritable> { 

      private final static Text INCIRCLE    = new Text("INCIRCLE"); 
      private final static LongWritable TimesInAMap = new LongWritable(100000000); 
      private static Random random = new Random(); 

      public class MyPoint { 
        private double x = 0.0; 
        private double y = 0.0; 

        MyPoint(double _x,double _y) { 
          this.x = _x; 
          this.y = _y; 
        } 

        public boolean inCircle() { 
          if (((x-0.5)*(x-0.5) + (y-0.5)*(y-0.5)) <= 0.25) 
            return true; 
          else 
            return false; 
        } 

        public void setPoint(double _x,double _y) { 
          this.x = _x; 
          this.y = _y; 
        } 
      } 
      public void map(LongWritable key, Text value, OutputCollector<Text, LongWritable> output, Reporter reporter) throws IOException { 
          long i = 0; 
          long N = TimesInAMap.get(); 
          MyPoint myPoint = new MyPoint(random.nextDouble(),random.nextDouble()); 
          long sum = 0; 
          while (i < N) {
            if (myPoint.inCircle()) {
              sum++;
            }
            myPoint.setPoint(random.nextDouble(),random.nextDouble());
            i++;
          }
          output.collect(INCIRCLE, new LongWritable(sum));
      }
}


    public static class Reduce extends MapReduceBase implements Reducer<Text, LongWritable, Text, LongWritable> { 
    public void reduce(Text key, Iterator<LongWritable> values, OutputCollector<Text, LongWritable> output, Reporter reporter) throws IOException { 
     long sum = 0; 
     while (values.hasNext()) { 
     //sum += values.next().get(); 
     output.collect(key, values.next()); 
     } 
     //output.collect(key, new LongWritable(sum)); 
    } 
    } 
    public static void main(String[] args) throws Exception { 
    JobConf conf = new JobConf(PiEstimation.class); 
    conf.setJobName("PiEstimation"); 

    conf.setOutputKeyClass(Text.class); 
    conf.setOutputValueClass(LongWritable.class); 

    conf.setMapperClass(Map.class); 
    conf.setCombinerClass(Reduce.class); 
    conf.setReducerClass(Reduce.class); 

    conf.setInputFormat(TextInputFormat.class); 
    conf.setOutputFormat(TextOutputFormat.class); 
    conf.setNumMapTasks(10); 
    conf.setNumReduceTasks(1); 
    FileInputFormat.setInputPaths(conf, new Path(args[0])); 
    FileOutputFormat.setOutputPath(conf, new Path(args[1])); 

    JobClient.runJob(conf); 
} 

}

Answers

2

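The number of map tasks launched is determined by several factors: mainly the input format, the block size of the input file(s), and whether the input files themselves are splittable.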
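To make the split arithmetic concrete, here is a rough sketch loosely modeled on how the old `org.apache.hadoop.mapred.FileInputFormat` sizes splits for a single splittable file: the requested map count only sets a goal size, and the actual split size is clamped by the minimum split size and the block size. The class name `SplitCountSketch`, the method `countSplits`, and the 105-byte file size are hypothetical, chosen for illustration; it is only a guess that something similar produced 11 tasks in the question.

```java
/**
 * Simplified, hypothetical model of split counting for ONE splittable file.
 * It is not Hadoop code; it only imitates the arithmetic described above.
 */
final class SplitCountSketch {

    // Assumption: imitates the 10% "slop" the old mapred FileInputFormat allows
    // on the last split before it cuts another one.
    static final double SPLIT_SLOP = 1.1;

    /**
     * Estimates how many splits (and therefore map tasks) one file yields.
     * requestedMaps plays the role of conf.setNumMapTasks(n), which is a hint only.
     */
    static int countSplits(long fileSize, int requestedMaps, long minSplitSize, long blockSize) {
        if (fileSize <= 0 || minSplitSize < 1 || blockSize < 1) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // The hint is turned into a goal size per split (integer division).
        long goalSize = fileSize / Math.max(1, requestedMaps);
        // The goal is capped by the block size and floored by the minimum split size.
        long splitSize = Math.max(minSplitSize, Math.min(goalSize, blockSize));

        int splits = 0;
        long remaining = fileSize;
        // Cut full-size splits while clearly more than one split's worth remains.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            remaining -= splitSize;
            splits++;
        }
        // Whatever is left becomes the final (possibly short or slightly long) split.
        if (remaining != 0) {
            splits++;
        }
        return splits;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // assumed block size
        // Hypothetical tiny file: 105 bytes with 10 requested maps gives 11 splits.
        System.out.println(countSplits(105, 10, 1, blockSize));
        // A 1000-byte file divides evenly into the 10 requested splits.
        System.out.println(countSplits(1000, 10, 1, blockSize));
    }
}
```

In this model a small file whose size is not a multiple of the requested map count can end up with one more split than requested, which is one way `setNumMapTasks(10)` could lead to 11 tasks.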

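Separately, the number of times map is called depends on the number of records in each map split (the data that a given mapper is processing).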

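Say your input is a 100-line text file. Most probably it will be processed by a single mapper, but the map method will be called 100 times, once for each line in the input file.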

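If you count the number of lines in your input file(s), that is the number of times map will be called across all mappers. It is hard to determine exactly how many times map will be called in each individual mapper.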
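A small plain-Java simulation (no Hadoop involved) may help show why the question's job printed 35 lines. It assumes the input has 35 lines, which is inferred from the output rather than stated in the question. Each record triggers one call to a stand-in `map`, and a reducer that forwards every value, as the posted `Reduce` does, keeps one output line per call, while a summing reducer like the commented-out lines would emit a single line. All names here (`MapCallCountSketch`, `distribute`, `runMapPhase`, and so on) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical simulation: map() runs once per record, not once per map task. */
final class MapCallCountSketch {

    /** Builds n dummy input lines (the content does not matter here). */
    static List<String> makeRecords(int n) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            records.add("line-" + i);
        }
        return records;
    }

    /** Deals the records round-robin into numTasks simulated splits. */
    static List<List<String>> distribute(List<String> records, int numTasks) {
        List<List<String>> splits = new ArrayList<>();
        for (int t = 0; t < numTasks; t++) {
            splits.add(new ArrayList<>());
        }
        for (int i = 0; i < records.size(); i++) {
            splits.get(i % numTasks).add(records.get(i));
        }
        return splits;
    }

    /** Stand-in for map(): emits exactly one value per call (a placeholder 1). */
    static long map(String record) {
        return 1L;
    }

    /** Runs every simulated task; each record in a split causes one map() call. */
    static List<Long> runMapPhase(List<List<String>> splits) {
        List<Long> emitted = new ArrayList<>();
        for (List<String> split : splits) {
            for (String record : split) {
                emitted.add(map(record));
            }
        }
        return emitted;
    }

    /** Forwards every value unchanged, like the reducer posted in the question. */
    static List<Long> passThroughReduce(List<Long> values) {
        return new ArrayList<>(values);
    }

    /** Adds the values up and emits one result, like the commented-out lines. */
    static List<Long> summingReduce(List<Long> values) {
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return Collections.singletonList(sum);
    }

    public static void main(String[] args) {
        List<List<String>> splits = distribute(makeRecords(35), 11);
        List<Long> emitted = runMapPhase(splits);
        System.out.println("map tasks: " + splits.size());
        System.out.println("map() calls: " + emitted.size());
        System.out.println("pass-through output lines: " + passThroughReduce(emitted).size());
        System.out.println("summing output lines: " + summingReduce(emitted).size());
    }
}
```

If one result per map task is the goal, one option is to run the sampling loop once per task instead of once per record; the simulation only illustrates the counting, not a fix.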

+0

You are right, thank you very much – 2012-03-27 01:47:53
