Running a Hadoop map-reduce job remotely causes EOFException?
I wrote a Hadoop map-reduce program and now I want to test it on a Cloudera Hadoop distribution running in VirtualBox on the same computer.
Here is how I submit the map-reduce job:
public class AvgCounter extends Configured implements Tool {

    public int run(String[] args) throws Exception {
        Job mrJob = Job.getInstance(new Cluster(getConf()), getConf());
        mrJob.setJobName("Average count");

        mrJob.setJarByClass(AvgCounter.class);

        mrJob.setOutputKeyClass(IntWritable.class);
        mrJob.setOutputValueClass(Text.class);

        mrJob.setMapperClass(AvgCounterMap.class);
        mrJob.setCombinerClass(AvgCounterReduce.class);
        mrJob.setReducerClass(AvgCounterReduce.class);

        mrJob.setInputFormatClass(TextInputFormat.class);
        mrJob.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.setInputPaths(mrJob, new Path("/user/test/testdata.csv"));
        FileOutputFormat.setOutputPath(mrJob, new Path("/user/test/result.txt"));

        mrJob.setWorkingDirectory(new Path("/tmp"));

        return mrJob.waitForCompletion(true) ? 1 : 0;
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://192.168.5.50:9000");
        conf.set("mapreduce.jobtracker.address", "192.168.5.50:9001");

        System.exit(ToolRunner.run(conf, new AvgCounter(), args));
    }
}
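AvgCounterMap has an empty map method that does nothing, and AvgCounterReduce has an empty reduce method that does nothing. When I try to run the main method I get the following exception: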
Exception in thread "main" java.io.IOException: Call to /192.168.5.50:9001 failed on local exception: java.io.EOFException
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1063)
at org.apache.hadoop.ipc.Client.call(Client.java:1031)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
at $Proxy0.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:235)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:275)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:249)
at org.apache.hadoop.mapreduce.Cluster.createRPCProxy(Cluster.java:86)
at org.apache.hadoop.mapreduce.Cluster.createClient(Cluster.java:98)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:74)
at eu.xxx.mapred.AvgCounter.run(AvgCounter.java:22)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
at eu.xxx.mapred.AvgCounter.main(AvgCounter.java:53)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:375)
at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:760)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:698)
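The `Caused by` frames show the client failing inside `DataInputStream.readInt` while waiting for the RPC response, which is what happens when the peer closes the connection before a full reply arrives. As a minimal sketch of that mechanism only (plain JDK sockets, not Hadoop's RPC layer; the class `EofDemo` and its throwaway local server are hypothetical), the following reproduces the same exception type:

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class EofDemo {

    /**
     * Starts a throwaway local server that accepts one connection and closes it
     * without writing anything, then tries to read a 4-byte int from it.
     * Returns true if the read failed with EOFException.
     */
    static boolean readFromSilentServer() throws IOException, InterruptedException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread acceptor = new Thread(() -> {
                try (Socket accepted = server.accept()) {
                    // Closed immediately: no response bytes are ever sent.
                } catch (IOException ignored) {
                    // Nothing to do in this illustration.
                }
            });
            acceptor.setDaemon(true);
            acceptor.start();

            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
                 DataInputStream in = new DataInputStream(client.getInputStream())) {
                in.readInt(); // the stream ends before 4 bytes are available
                return false;
            } catch (EOFException expected) {
                return true;
            } finally {
                acceptor.join(1000);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("EOFException seen: " + readFromSilentServer());
    }
}
```

This only illustrates that the TCP connection was established and then ended without a response; it does not say why the server at 192.168.5.50:9001 hung up.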
The Cloudera virtual machine running Hadoop has the following in the file /etc/hadoop/conf/core.site.xml:
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.5.50:9000</value>
</property>
and the following in the file /etc/hadoop/conf/mapred.site.xml:
<property>
<name>mapred.job.tracker</name>
<value>192.168.5.50:9001</value>
</property>
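I also checked the connection to the virtual machine by entering 192.168.5.50:50030 into my web browser, and I get the Hadoop Map/Reduce Administration page as expected. So what causes this exception, and how can I get rid of it?
Thanks for any ideas.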
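The browser check above covers only the web UI port. A minimal sketch of the same kind of check for the RPC ports, using plain JDK sockets, might look like the following. The class name `PortCheck` and the 2-second timeout are assumptions made for illustration; the host and ports are the ones from the question.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unreachable hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = "192.168.5.50";      // VM address from the question
        int[] ports = {9000, 9001, 50030}; // HDFS, JobTracker RPC, JobTracker web UI
        for (int port : ports) {
            String state = canConnect(host, port, 2000) ? "open" : "unreachable";
            System.out.println(host + ":" + port + " -> " + state);
        }
    }
}
```

An open port shows only that something accepts TCP connections there. It does not show that client and server speak the same RPC protocol, so this check can pass while the job submission still fails.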
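I also tried killing iptables on the Cloudera machine, with no result. – drasto
Can you ping both of these IPs from the tasktracker? –
Are you mixing libraries (client using Apache Hadoop, server side running Cloudera Hadoop)? –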
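One way to follow up on the mixed-libraries question is to print which Hadoop build the client actually loads. The sketch below is only an illustration: the class `HadoopVersionProbe` is hypothetical, and it calls Hadoop's `org.apache.hadoop.util.VersionInfo.getVersion()` through reflection so that it still compiles and runs when no Hadoop jar is on the classpath.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

final class HadoopVersionProbe {

    /** Describes where a class was loaded from, or reports that it is missing. */
    static String locate(String className) {
        try {
            Class<?> type = Class.forName(className);
            CodeSource source = type.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK itself may have no code source.
            return source == null ? "loaded by the JDK runtime" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    /** Calls VersionInfo.getVersion() reflectively; falls back to a message if Hadoop is absent. */
    static String hadoopVersion() {
        try {
            Class<?> info = Class.forName("org.apache.hadoop.util.VersionInfo");
            Method getVersion = info.getMethod("getVersion");
            return String.valueOf(getVersion.invoke(null));
        } catch (ReflectiveOperationException | LinkageError e) {
            return "unknown (Hadoop not on classpath)";
        }
    }

    public static void main(String[] args) {
        System.out.println("Client Hadoop version: " + hadoopVersion());
        System.out.println("Cluster class from:    " + locate("org.apache.hadoop.mapreduce.Cluster"));
    }
}
```

Running this with the same classpath as AvgCounter and comparing the result with the output of `hadoop version` on the Cloudera machine would show whether the two sides come from different builds.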