2015-09-19 21 views
0

Why is context.write not working as expected in Hadoop MapReduce?

I have one MR job whose output looks like this:

128.187.140.171,11 
129.109.6.54,27 
129.188.154.200,44 
129.193.116.41,5 
129.217.186.112,17 

In the mapper code of the second MR job, I do this:

public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    // Parse the input line ("host,count") into its two fields
    // System.out.println(value.toString());
    if (value.toString().contains(",")) {
        System.out.println("Inside");
        String[] arr = value.toString().split(",");
        if (arr.length > 1) {
            System.out.println(arr[1]);
            // Emit the count as the key and the host as the value
            context.write(new Text(arr[1]), new Text(arr[0]));
        }
    }
}

The output of the print statements is correct:

Inside 
11 
Inside 
27 

But context.write keeps producing output like this:

1,slip4068.sirius.com 
1,hstar.gsfc.nasa.gov 
1,ad11-010.compuserve.com 
1,slip85-2.co.us.ibm.net 
1,stimpy.actrix.gen.nz 
1,j14.ktk1.jaring.my 
1,ad08-009.compuserve.com 

Why do I keep getting the key 1? This is my driver code:

public int run(String[] args) throws Exception {
    // TODO Auto-generated method stub
    Configuration conf = getConf();
    conf.set("mapreduce.output.textoutputformat.separator", ",");

    Job job = new Job(conf, "WL Demo");
    job.setJarByClass(WLDemo.class);
    job.setMapperClass(WLMapper1.class);
    job.setReducerClass(WLReducer1.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    Path in = new Path(args[0]);
    Path out = new Path(args[1]);
    Path out2 = new Path(args[2]);

    FileInputFormat.setInputPaths(job, in);
    FileOutputFormat.setOutputPath(job, out);

    boolean succ = job.waitForCompletion(true);
    if (!succ) {
        System.out.println("Job1 failed, exiting");
        return -1;
    }

    Job job2 = new Job(conf, "top-k-pass-2");
    FileInputFormat.setInputPaths(job2, out);
    FileOutputFormat.setOutputPath(job2, out2);
    job2.setJarByClass(WLDemo.class);
    job2.setMapperClass(WLMapper2.class);
    // job2.setReducerClass(Reducer1.class);
    job2.setInputFormatClass(TextInputFormat.class);
    job2.setMapOutputKeyClass(Text.class);
    job2.setMapOutputValueClass(Text.class);
    job2.setNumReduceTasks(1);

    succ = job2.waitForCompletion(true);
    if (!succ) {
        System.out.println("Job2 failed, exiting");
        return -1;
    }
    return 0;
}

How can I get the correct values as the keys in the output of my second MR job?

Answers

1

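Change job2.setNumReduceTasks(1) to job2.setNumReduceTasks(0). With one reduce task and no reducer class set, job2 runs the identity reducer, and the output you are looking at begins with the records whose key is 1, so some records in the first job's output must have a count of 1.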
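To make the difference between zero and one reduce task concrete, here is a rough plain-Java simulation of job2. It does not use any Hadoop classes: ShuffleOrderDemo and its method names are made up for illustration, a TreeMap stands in for the shuffle's sort on Text keys, and the two input lines with a count of 1 are inferred from the output shown in the question rather than copied from the real data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java stand-in for job2; a sketch only, no Hadoop classes are used. */
final class ShuffleOrderDemo {

    /** Mirrors WLMapper2: "host,count" becomes {count, host}; null if there is no comma. */
    static String[] mapRecord(String line) {
        String[] arr = line.trim().split(",");
        return arr.length > 1 ? new String[] {arr[1], arr[0]} : null;
    }

    /** Zero reduce tasks: mapper output is written directly, in input order. */
    static List<String> mapOnly(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            String[] kv = mapRecord(line);
            if (kv != null) {
                out.add(kv[0] + "," + kv[1]);
            }
        }
        return out;
    }

    /** One reduce task with the identity reducer: records arrive sorted by key. */
    static List<String> singleIdentityReducer(List<String> lines) {
        // TreeMap approximates the sort on Text keys (lexicographic for ASCII digits).
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] kv = mapRecord(line);
            if (kv != null) {
                grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
            }
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            // Value order within one key is kept as input order here for simplicity.
            for (String host : entry.getValue()) {
                out.add(entry.getKey() + "," + host);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> job1Output = Arrays.asList(
                "128.187.140.171,11",
                "slip4068.sirius.com,1",   // inferred input line, not shown in the question
                "129.109.6.54,27",
                "hstar.gsfc.nasa.gov,1");  // inferred input line, not shown in the question
        System.out.println(mapOnly(job1Output));
        System.out.println(singleIdentityReducer(job1Output));
    }
}
```

In this sketch the map-only list starts with 11,128.187.140.171, while the single-reducer list starts with the two records whose key is 1. That matches what the question describes: the keys are written correctly, but the top of the sorted output file is filled with the count-1 hosts.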

+0

That solution works, but I still don't understand why it shows 1 as the key when I use job2.setNumReduceTasks(1). What is the reason behind it? – DevHelp

+0

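Some of the records that map2 (WLMapper2) emits have the key 1. Because the number of reducers is set to 1, an identity reducer runs, and the keys of the map2 output reach it sorted in the default ascending order. That should be the reason. –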
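One detail worth adding about that default ascending order: the mapper emits its keys as Text, and Text keys are compared as strings, not as numbers. The plain-Java sketch below (KeyOrderDemo is a made-up name, and String comparison is used as an approximation of Text ordering, which agrees for ASCII digits) sorts the counts from the question both ways.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Compares string ordering with numeric ordering for the same keys; illustration only. */
final class KeyOrderDemo {

    /** Order the keys as strings, character by character, the way Text-like keys sort. */
    static List<String> sortAsText(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        Collections.sort(sorted);
        return sorted;
    }

    /** Order the same keys by numeric value, as an integer key type would. */
    static List<String> sortAsNumbers(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(Comparator.comparingInt(Integer::parseInt));
        return sorted;
    }

    public static void main(String[] args) {
        // Counts taken from the first job's output shown in the question.
        List<String> counts = Arrays.asList("11", "27", "44", "5", "17");
        System.out.println(sortAsText(counts));    // [11, 17, 27, 44, 5]
        System.out.println(sortAsNumbers(counts)); // [5, 11, 17, 27, 44]
    }
}
```

If the second job is meant to order hosts by count, which the job name "top-k-pass-2" hints at but the thread does not confirm, one option would be to emit the count as an IntWritable key instead of Text so that the shuffle sorts numerically. That would be a change to the mapper and to setMapOutputKeyClass, and it is a suggestion rather than something the original answer proposes.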
