I am trying to implement a simple GROUP BY in MapReduce, but the Reducer class below is not working as expected in Hadoop MapReduce.

My input file contains:

7369,SMITH,CLERK,800,20 
7499,ALLEN,SALESMAN,1600,30 
7521,WARD,SALESMAN,1250,30 
7566,JONES,MANAGER,2975,20 
7654,MARTIN,SALESMAN,1250,30 
7698,BLAKE,MANAGER,2850,30 
7782,CLARK,MANAGER,2450,10 
7788,SCOTT,ANALYST,3000,20 
7839,KING,PRESIDENT,5000,10 
7844,TURNER,SALESMAN,1500,30 
7876,ADAMS,CLERK,1100,20 
7900,JAMES,CLERK,950,30 
7902,FORD,ANALYST,3000,20 
7934,MILLER,CLERK,1300,10 

My Mapper class:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class Groupmapper extends Mapper<Object, Text, IntWritable, IntWritable> {
    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        // Input line format: empno,ename,job,sal,deptno
        String line = value.toString();
        String[] parts = line.split(",");
        String token1 = parts[3];   // salary
        String token2 = parts[4];   // department number
        int deptno = Integer.parseInt(token2);
        int sal = Integer.parseInt(token1);
        // Emit (deptno, sal) so salaries can be summed per department
        context.write(new IntWritable(deptno), new IntWritable(sal));
    }
}

Reducer class:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Reducer;

public class Groupreducer extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {
    IntWritable result = new IntWritable();

    public void Reduce(IntWritable key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        int sum = 0;
        // Sum all salary values for this department key
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}

Driver class:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Group {
    public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "Group");
        job.setJarByClass(Group.class);
        job.setMapperClass(Groupmapper.class);
        job.setCombinerClass(Groupreducer.class);
        job.setReducerClass(Groupreducer.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The expected output should be:

10  8750 
20  10875 
30  9400 
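
(As a check: for department 10 the salaries are 2450 + 5000 + 1300 = 8750, and the sums for departments 20 and 30 work out the same way.)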

But it prints the following output instead. It is not aggregating the values at all; it behaves like an identity reducer.

10  1300 
10  5000 
10  2450 
20  1100 
20  3000 
20  800 
20  2975 
20  3000 
30  1500 
30  1600 
30  2850 
30  1250 
30  1250 
30  950 

The reduce function is not working correctly.

Answer

It looks as if reduce is not being used at all, so taking a closer look at your reducer is the next debugging step.

If you add an @Override to your reduce method (just as you did on your map method), you will see that you get a Method does not override method from its superclass error. That means Hadoop will never call your reduce method and will fall back to the default identity implementation.

The problem is that you have:

public void Reduce(IntWritable key,Iterable<IntWritable> values, Context context)

when it should be:

public void reduce(IntWritable key,Iterable<IntWritable> values, Context context)

The only difference is that the method name must start with a lowercase r. Java method names are case-sensitive, so Reduce is treated as an unrelated method that never overrides Reducer.reduce, and the inherited default reduce (which simply writes each incoming value straight through) runs instead.
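
For reference, here is the reducer with that one-character fix applied. The @Override annotation now compiles (confirming the method really overrides Reducer.reduce), and the job produces the expected per-department sums:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Reducer;

public class Groupreducer extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override   // compiles now: reduce (lowercase) matches the superclass signature
    public void reduce(IntWritable key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        // Sum all salaries emitted for this department
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}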