2015-11-12 24 views

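Hadoop java.lang.ArrayIndexOutOfBoundsException: 3

The input is a list of housing records, where each record describes a single house: (address, city, state, zip, value). The five fields in a record are separated by commas (,). The output should be the average house value for each zip code. Here is my current code: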

public class ziphousevalue1 { 

    public static class ZipHouseValueMapper extends Mapper < LongWritable, Text, Text, IntWritable > { 
     private static final Text zip = new Text(); 
     private static final IntWritable value = new IntWritable(); 

     protected void map(LongWritable offset, Text line, Context context) throws IOException, InterruptedException { 
      String[] tokens = value.toString().split(","); 
      zip.set(tokens[3]); 
      value.set(Integer.parseInt(tokens[4])); 
      context.write(new Text(zip), value); 
     } 
    } 

    public static class ZipHouseValueReducer extends Reducer < Text, IntWritable, Text, DoubleWritable > { 

     private DoubleWritable average = new DoubleWritable(); 

     protected void reduce(Text zip, Iterable <IntWritable> values, Context context) throws IOException, InterruptedException { 
      int count = 0; 
      int sum = 0; 
      for (IntWritable value: values) { 
       sum += value.get(); 
       count++; 
      } 
      average.set(sum/count); 
      context.write(zip, average); 
     } 
    } 

    public static void main(String[] args) throws Exception { 
     Configuration conf = new Configuration(); 
     String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs(); 
     if (otherArgs.length != 2) { 
      System.err.println("Usage: ziphousevalue <in> <out>"); 
      System.exit(2); 
     } 
     Job job = new Job(conf, "ziphousevalue"); 
     job.setJarByClass(ziphousevalue1.class); 
     job.setMapperClass(ZipHouseValueMapper.class); 
     job.setReducerClass(ZipHouseValueReducer.class); 

     job.setNumReduceTasks(3); 
     job.setOutputKeyClass(Text.class); 
     job.setOutputValueClass(IntWritable.class); 
     FileInputFormat.addInputPath(job, new Path(otherArgs[0])); 
     FileOutputFormat.setOutputPath(job, new Path(otherArgs[1])); 
     configure(conf); 
     System.exit(job.waitForCompletion(true) ? 0 : 1); 
    } 

    public static void configure(Configuration conf) { 
     System.out.println("Test+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++"); 

    } 
} 

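However, it produces the following error. I have looked at similar questions on this site, and none of them seem to solve my problem. I am sure the input file is correct. What should I check to resolve this error? Thanks for your time.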

java.lang.Exception: java.lang.ArrayIndexOutOfBoundsException: 3 
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
Caused by: java.lang.ArrayIndexOutOfBoundsException: 3 
at ziphousevalue1$ZipHouseValueMapper.map(ziphousevalue1.java:29) 
at ziphousevalue1$ZipHouseValueMapper.map(ziphousevalue1.java:24) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) 
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) 
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
at java.lang.Thread.run(Thread.java:745) 
15/11/11 22:10:42 INFO mapreduce.Job: Job job_local112498506_0001 running in uber mode : false 
15/11/11 22:10:42 INFO mapreduce.Job: map 0% reduce 0% 
15/11/11 22:10:42 INFO mapreduce.Job: Job job_local112498506_0001 failed with state FAILED due to: NA 
15/11/11 22:10:42 INFO mapreduce.Job: Counters: 0 

The exception tells you exactly what the problem is - start debugging! – John3136

Answer


In ZipHouseValueMapper.map, you have:

String[] tokens = value.toString().split(","); 
zip.set(tokens[3]); 
value.set(Integer.parseInt(tokens[4])); 

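This requires value to contain at least 5 comma-separated fields, but value is the newly created IntWritable field, not the input record. A fresh IntWritable holds 0, so value.toString() is "0", split(",") returns a single-element array, and tokens[3] throws ArrayIndexOutOfBoundsException: 3. You probably want to split line, the Text argument passed to map, instead.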
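To make the failure concrete, here is a minimal sketch in plain Java with no Hadoop dependency. The class `ZipValueParser`, the method `parseZipAndValue`, and the sample record are hypothetical and only for illustration. It assumes every valid record has exactly five comma-separated fields with no embedded commas, and it returns null for anything else instead of indexing blindly.

```java
public class ZipValueParser {

    /**
     * Splits one record of the form address,city,state,zip,value and
     * returns {zip, value}, or null if the record does not have
     * exactly five comma-separated fields.
     */
    public static String[] parseZipAndValue(String line) {
        String[] tokens = line.split(",");
        if (tokens.length != 5) {
            return null; // malformed record: let the caller skip it
        }
        return new String[] { tokens[3].trim(), tokens[4].trim() };
    }

    public static void main(String[] args) {
        // What the buggy mapper effectively split: the text form of a
        // fresh IntWritable, which is "0". Only one token comes back,
        // so index 3 is out of bounds.
        String[] fromValue = "0".split(",");
        System.out.println(fromValue.length); // prints 1

        // What it should split: the input line itself (made-up sample record).
        String[] parsed = parseZipAndValue("12 Oak St,Springfield,IL,62704,150000");
        System.out.println(parsed[0] + " -> " + Integer.parseInt(parsed[1])); // prints 62704 -> 150000
    }
}
```

In the mapper, the same length check could be applied to line.toString() before touching tokens[3] and tokens[4], so that a blank or truncated line is skipped rather than failing the whole task.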


I didn't realize that. It should be 'line.toString().split(",");'. Thank you very much. It works now. – ARSN


@aguibert I'm trying to. The site says I can accept the answer in 3 minutes. – ARSN


Oh right, I forgot about that restriction =) –
