2017-02-07 47 views

I am trying to run a MapReduce program in Java with 2 input files and 2 mappers, and I get this Hadoop error: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable

Here is the code:

public class CounterMapper { 
    public static class MyMap1 extends 
      Mapper<LongWritable, Text, Text, LongWritable> { 
     public void map(LongWritable key, Text value, Context context) 
       throws IOException, InterruptedException { 
      String[] line = value.toString().split("\t"); 

      int age = Integer.parseInt(line[26]); 

      context.write(new Text(line[7]), new LongWritable(age)); 
     } 
    } 

    public static class MyMap2 extends Mapper { 
     public void map(LongWritable key, Text value, Context context) 
       throws IOException, InterruptedException { 
      String[] line = value.toString().split("\t"); 
      int age = Integer.parseInt(line[26]); 
      context.write(new Text(line[7]), new LongWritable(age)); 
     } 
    } 

    public static class MyRed extends 
      Reducer<Text, LongWritable, Text, LongWritable> { 
     String line = null; 

     public void reduce(Text key, Iterable<LongWritable> values, 
       Context context) throws IOException, InterruptedException { 
      for (LongWritable value : values) { 
       line = value.toString(); 
      } 
      context.write(key, new LongWritable()); 
     } 
    } 

    public static void main(String[] args) throws Exception { 
     Configuration conf = new Configuration(); 
     @SuppressWarnings("deprecation") 
     Job job = new Job(conf, "ProjectQuestion2"); 
     job.setJarByClass(CounterMapper.class); 
     FileOutputFormat.setOutputPath(job, new Path(args[2])); 

     job.setNumReduceTasks(1); 
     job.setMapperClass(MyMap1.class); 
     job.setMapperClass(MyMap2.class); 
     job.setReducerClass(MyRed.class); 
     job.setMapOutputKeyClass(Text.class); 
     job.setMapOutputValueClass(LongWritable.class); 
     job.setMapOutputKeyClass(Text.class); 
     job.setMapOutputValueClass(LongWritable.class); 

     MultipleInputs.addInputPath(job, new Path(args[0]), 
       TextInputFormat.class, MyMap1.class); 
     MultipleInputs.addInputPath(job, new Path(args[1]), 
       TextInputFormat.class, MyMap2.class); 

     System.exit(job.waitForCompletion(true) ? 0 : 1); 
    } 
} 

After running the job, I get the following error:

INFO mapreduce.Job: Task Id : attempt_1486434709675_0016_m_000000_2, Status : FAILED
Error: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1073)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:715)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
    at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapreduce.lib.input.DelegatingMapper.run(DelegatingMapper.java:55)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Any input is appreciated... thanks.

Answer


It is likely that the map() method from the base Mapper class is being used instead of yours. Since that is an identity mapper (a pass-through), it would match the error you are seeing.

I would do a couple of things:

  1. Change MyMap2 from extending the raw Mapper to extending Mapper<LongWritable, Text, Text, LongWritable>.
  2. Make sure your map() methods actually override the base Mapper class by adding the @Override annotation to them.
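The failure mode behind both points can be reproduced without Hadoop. This is a minimal sketch (the class names Base, RawSub, and TypedSub are made up for illustration): when a subclass extends a raw generic base, the type parameters erase to Object, so a method with concrete parameter types becomes an overload rather than an override, and the parent's pass-through behavior still runs. Adding @Override to RawSub.map() would turn this silent bug into a compile error.

```java
// Stand-in for Mapper<KEYIN, VALUEIN, ...>: the base map() is a pass-through.
class Base<K, V> {
    public String map(K key, V value) { return "identity:" + key; }
}

// Raw subclass, like "MyMap2 extends Mapper": K and V erase to Object,
// so map(Long, String) is an OVERLOAD, not an override -- the base class's
// identity map() is still the one the framework would dispatch to.
class RawSub extends Base {
    public String map(Long key, String value) { return "custom:" + value; }
}

// Parameterized subclass: the signature matches after substitution,
// @Override compiles, and the custom method really replaces the base one.
class TypedSub extends Base<Long, String> {
    @Override
    public String map(Long key, String value) { return "custom:" + value; }
}

public class OverrideDemo {
    public static void main(String[] args) {
        Base raw = new RawSub();
        Base<Long, String> typed = new TypedSub();
        System.out.println(raw.map((Object) 1L, (Object) "x")); // identity:1
        System.out.println(typed.map(1L, "x"));                 // custom:x
    }
}
```

In the Hadoop case the same mechanism means the framework's run loop calls the inherited identity map(), which writes the LongWritable input key straight through, producing exactly the "expected Text, received LongWritable" mismatch.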

You could also (as improvements):

  1. Change Job job = new Job(conf, "ProjectQuestion2"); to Job job = Job.getInstance(conf, "ProjectQuestion2"); to remove the deprecation warning.
  2. job.setMapOutputKeyClass() and job.setMapOutputValueClass() are each called twice, so you can remove one pair of calls.