
I wrote a Hadoop map-reduce program and now I want to test it against the Cloudera Hadoop distribution running in VirtualBox on the same computer. Why does running the Hadoop map-reduce job remotely fail with an EOFException?

Here is how I submit the map-reduce job:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Cluster;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class AvgCounter extends Configured implements Tool {

    public int run(String[] args) throws Exception {
        // Connect to the cluster described by the configuration built in main().
        Job mrJob = Job.getInstance(new Cluster(getConf()), getConf());
        mrJob.setJobName("Average count");

        mrJob.setJarByClass(AvgCounter.class);
        mrJob.setOutputKeyClass(IntWritable.class);
        mrJob.setOutputValueClass(Text.class);
        mrJob.setMapperClass(AvgCounterMap.class);
        mrJob.setCombinerClass(AvgCounterReduce.class);
        mrJob.setReducerClass(AvgCounterReduce.class);
        mrJob.setInputFormatClass(TextInputFormat.class);
        mrJob.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.setInputPaths(mrJob, new Path("/user/test/testdata.csv"));
        FileOutputFormat.setOutputPath(mrJob, new Path("/user/test/result.txt"));
        mrJob.setWorkingDirectory(new Path("/tmp"));
        // ToolRunner passes this value to System.exit(), so 0 means success.
        return mrJob.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the NameNode and JobTracker on the Cloudera VM.
        conf.set("fs.defaultFS", "hdfs://192.168.5.50:9000");
        conf.set("mapreduce.jobtracker.address", "192.168.5.50:9001");
        System.exit(ToolRunner.run(conf, new AvgCounter(), args));
    }
}
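
The two classes referenced above, AvgCounterMap and AvgCounterReduce, were not shown in the question. A minimal sketch of such do-nothing stubs might look as follows; their exact signatures are my assumption, chosen to match the TextInputFormat input types and the output key/value classes set in run():

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Do-nothing mapper: consumes TextInputFormat records and emits nothing.
public class AvgCounterMap extends Mapper<LongWritable, Text, IntWritable, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // intentionally empty
    }
}

// Do-nothing reducer, also used as the combiner in run() (normally in its own file).
class AvgCounterReduce extends Reducer<IntWritable, Text, IntWritable, Text> {
    @Override
    protected void reduce(IntWritable key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // intentionally empty
    }
}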

Both classes are just stubs: AvgCounterMap has an empty map method and AvgCounterReduce has an empty reduce method, so neither does anything. When I try to run the main method, I get the following exception:

Exception in thread "main" java.io.IOException: Call to /192.168.5.50:9001 failed on local exception: java.io.EOFException 
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1063) 
    at org.apache.hadoop.ipc.Client.call(Client.java:1031) 
    at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198) 
    at $Proxy0.getProtocolVersion(Unknown Source) 
    at org.apache.hadoop.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:235) 
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:275) 
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:249) 
    at org.apache.hadoop.mapreduce.Cluster.createRPCProxy(Cluster.java:86) 
    at org.apache.hadoop.mapreduce.Cluster.createClient(Cluster.java:98) 
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:74) 
    at eu.xxx.mapred.AvgCounter.run(AvgCounter.java:22) 
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69) 
    at eu.xxx.mapred.AvgCounter.main(AvgCounter.java:53) 
Caused by: java.io.EOFException 
    at java.io.DataInputStream.readInt(DataInputStream.java:375) 
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:760) 
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:698) 

The virtual Cloudera machine running Hadoop has the following in /etc/hadoop/conf/core-site.xml:

<property> 
    <name>fs.default.name</name> 
    <value>hdfs://192.168.5.50:9000</value> 
</property> 

and the following in /etc/hadoop/conf/mapred-site.xml:

<property> 
    <name>mapred.job.tracker</name> 
    <value>192.168.5.50:9001</value> 
</property> 

I also checked the connection to the virtual machine by entering 192.168.5.50:50030 in my web browser, and I got the Hadoop Map/Reduce administration page as expected. So what is causing this exception, and how do I get rid of it?

Thanks for any ideas.
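
One way to narrow a failure like this down is to talk to HDFS alone from the same client classpath: the NameNode RPC on port 9000 goes through the same client-side IPC code, so if a plain listing like the sketch below also dies with an EOFException, the problem is in the client libraries rather than in the JobTracker settings. This is only an illustrative sketch; the class name and paths are assumptions.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsConnectivityCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Same NameNode address as in AvgCounter.main().
        FileSystem fs = FileSystem.get(URI.create("hdfs://192.168.5.50:9000"), conf);
        // Listing the input directory exercises the client-side RPC stack
        // without involving the JobTracker at all.
        for (FileStatus status : fs.listStatus(new Path("/user/test"))) {
            System.out.println(status.getPath());
        }
    }
}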

I also tried shutting down iptables on the Cloudera machine, with no result. – drasto

Can you ping both IPs from the tasktracker? –

Are you mixing libraries (the client uses Apache Hadoop while the server side is Cloudera Hadoop)? –

Answer

The problem was that the client was using a different version of the Hadoop API (0.23.0) than the one installed on the Hadoop machine.
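
Since the fix is matching the client library to the installed Hadoop version, a quick check of what the client classpath actually provides can confirm such a mismatch. The sketch below (class name is illustrative) prints the client-side version, which can then be compared with the output of "hadoop version" on the Cloudera VM.

import org.apache.hadoop.util.VersionInfo;

public class ClientVersionCheck {
    public static void main(String[] args) {
        // Prints the Hadoop version present on the client classpath so it can be
        // compared with what the cluster itself is running.
        System.out.println("Client Hadoop version: " + VersionInfo.getVersion());
    }
}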