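Hadoop ClassCastException for default value of InputFormat
I'm having trouble running my first MapReduce code on Hadoop. I copied the following code from "Hadoop: The Definitive Guide", but I can't get it to run on my single-node Hadoop installation.
My code snippets:
Main: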
Job job = new Job();
job.setJarByClass(MaxTemperature.class);
job.setJobName("Max temperature");
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.setMapperClass(MaxTemperatureMapper.class);
job.setReducerClass(MaxTemperatureReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
System.exit(job.waitForCompletion(true) ? 0 : 1);
Mapper:
public void map(LongWritable key, Text value, Context context)
Reducer:
public void reduce(Text key, Iterable<IntWritable> values,
    Context context)
The map and reduce implementations are also taken from the book as-is. But when I try to execute this code, this is the error I get:
INFO mapred.JobClient: Task Id : attempt_201304021022_0016_m_000000_0, Status : FAILED
java.lang.ClassCastException: interface javax.xml.soap.Text
at java.lang.Class.asSubclass(Class.java:3027)
at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:774)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:959)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:674)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
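The answers to a similar earlier question (Hadoop type mismatch in key from map expected value Text received value LongWritable) helped me figure out that the InputFormat class should match the input types of the map function. So I also tried calling job.setInputFormatClass(TextInputFormat.class); in my main method, but that did not solve the problem either. What could be the issue here?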
Here is the implementation of the mapper class:
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    // Sentinel value that marks a missing temperature reading
    private static final int MISSING = 9999;

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        String year = line.substring(15, 19);
        int airTemperature;
        if (line.charAt(45) == '+') { // parseInt doesn't like leading plus signs
            airTemperature = Integer.parseInt(line.substring(46, 50));
        } else {
            airTemperature = Integer.parseInt(line.substring(45, 50));
        }
        String quality = line.substring(50, 51);
        if (airTemperature != MISSING && quality.matches("[01459]")) {
            context.write(new Text(year), new IntWritable(airTemperature));
        }
    }
}
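Chris, I don't know where javax.xml.soap.Text is coming from. I am using org.apache.hadoop.io.Text, and I haven't included any other third-party jars in my code. – devj 2013-04-04 15:12:33
Have you checked both the class containing the main() method and your Mapper class? It looks like the exception is thrown in the mapper. If you're still looking, just post the whole Mapper class, but pay attention to the declared types of what you write to the context. – 2013-04-04 15:16:33
Thanks Chris, I was using javax.xml.soap.Text in my main method and had overlooked it. Changing it to org.apache.hadoop.io.Text solved the problem. – devj 2013-04-04 15:37:11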
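For anyone hitting the same trace: the top frames (Class.asSubclass called from JobConf.getOutputKeyComparator) suggest that Hadoop checks the configured output key class against its comparable-key interface, and that the check fails when the wrong Text was imported in the driver. The following is only a minimal sketch of that mechanism in plain Java, with no Hadoop on the classpath. The names WritableComparableLike, HadoopLikeText, SoapLikeText and describe are hypothetical stand-ins invented for this illustration, not real Hadoop or SOAP types.

```java
public class AsSubclassDemo {
    // Hypothetical stand-in for the interface a map output key is expected to implement.
    interface WritableComparableLike {}

    // Hypothetical stand-in for org.apache.hadoop.io.Text: implements the expected interface.
    static final class HadoopLikeText implements WritableComparableLike {}

    // Hypothetical stand-in for javax.xml.soap.Text: an unrelated interface with the same simple name.
    interface SoapLikeText {}

    // Returns "ok" if keyClass is a subtype of WritableComparableLike,
    // otherwise the message of the ClassCastException thrown by asSubclass.
    static String describe(Class<?> keyClass) {
        try {
            keyClass.asSubclass(WritableComparableLike.class);
            return "ok";
        } catch (ClassCastException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // The correct key type passes the check.
        System.out.println(describe(HadoopLikeText.class));
        // The unrelated interface fails; the message begins with "interface ",
        // the same shape as "interface javax.xml.soap.Text" in the trace above.
        System.out.println(describe(SoapLikeText.class));
    }
}
```

Because both types are called Text, the driver compiles either way, and the mistake only surfaces at runtime inside the map task. Checking the import lines of the class that calls job.setOutputKeyClass(Text.class) is the quickest way to rule this out.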