Hadoop connection to the server suddenly stopped

I was running a Hadoop job that takes several hours, and it suddenly stopped for some reason I don't understand, with the error below:
HadoopTree.mapredUtils.JobResultException: //0/0/0/0 could not be properly divided by SplitSamples
at HadoopTree.TTrain.TreeTrainer_sp$SplitSamplesListener.stateChanged(TreeTrainer_sp.java:335)
at HadoopTree.mapredUtils.JobResultManager.poll(JobResultManager.java:76)
at HadoopTree.TTrain.TreeTrainer_sp.developTree(TreeTrainer_sp.java:577)
at HadoopTree.apps.MainTrainTree.run(MainTrainTree.java:64)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
at HadoopTree.apps.MainTrainTree.main(MainTrainTree.java:26)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
at HadoopTree.apps.Driver.main(Driver.java:37)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:192)
I checked the logs, and I found that just before the error occurred, these normal syslog messages were written to the secondary NameNode's log file:
2015-02-18 08:35:11,834 INFO org.apache.hadoop.security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
2015-02-18 08:35:12,010 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=SHUFFLE, sessionId=
2015-02-18 08:35:12,014 WARN org.apache.hadoop.conf.Configuration: user.name is deprecated. Instead, use mapreduce.job.user.name
2015-02-18 08:35:12,060 WARN org.apache.hadoop.conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
2015-02-18 08:35:12,089 INFO org.apache.hadoop.mapred.Task: Task:attempt_201502172051_0618_r_000003_0 is done. And is in the process of commiting
2015-02-18 08:35:12,091 INFO org.apache.hadoop.mapred.Task: Task 'attempt_201502172051_0618_r_000003_0' done.
And at the time the error occurred, this was written to the secondary NameNode's log file:
2015-02-18 09:55:08,962 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 0 time(s).
2015-02-18 09:55:09,963 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
2015-02-18 09:55:10,963 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
2015-02-18 09:55:11,964 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
2015-02-18 09:55:12,965 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
2015-02-18 09:55:13,965 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 5 time(s).
2015-02-18 09:55:14,966 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 6 time(s).
2015-02-18 09:55:15,966 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 7 time(s).
2015-02-18 09:55:16,967 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 8 time(s).
2015-02-18 09:55:17,968 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 9 time(s).
2015-02-18 09:55:17,968 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint:
2015-02-18 09:55:17,968 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.net.ConnectException: Call to localhost/127.0.0.1:54310 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:932)
at org.apache.hadoop.ipc.Client.call(Client.java:908)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
at com.sun.proxy.$Proxy4.getEditLogSize(Unknown Source)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:225)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:373)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:417)
at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:207)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1025)
at org.apache.hadoop.ipc.Client.call(Client.java:885)
... 4 more
2015-02-18 10:00:18,970 INFO org.apache.hadoop.ipc.Client: Retrying connect
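The repeated "Retrying connect to server: localhost/127.0.0.1:54310" followed by "Connection refused" suggests nothing was listening on the NameNode's RPC port anymore. A minimal way to check this (a diagnostic sketch, assuming a single-node Hadoop 1.x setup using the localhost:54310 address from the logs above):

```shell
# List the running Hadoop JVMs; if NameNode is absent, its process died,
# which would explain "Connection refused" from every client.
jps | grep NameNode

# Check whether anything is still listening on the NameNode RPC port.
netstat -tln 2>/dev/null | grep ':54310'
```

If `jps` shows no NameNode, the next place to look is the NameNode's own log for the last FATAL/ERROR entry before it exited.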
I also found this error in the NameNode's log file:
java.io.IOException: File /jobtracker/jobsInfo/job_201502172051_0597.info could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1448)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:690)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:342)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1350)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1346)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1344)
I can see the error, but what is causing it? – Tak 2015-02-18 08:33:21
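The "could only be replicated to 0 nodes, instead of 1" message means the NameNode had no live DataNodes registered when the write was attempted. A hedged sketch of how one might check this (standard Hadoop 1.x commands; the log path is illustrative and depends on your `HADOOP_HOME` and hostname):

```shell
# Ask the NameNode how many DataNodes it considers live; "0 nodes" in the
# replication error means this report would show no available DataNodes.
hadoop dfsadmin -report

# Inspect the tail of the DataNode log for why it dropped out, e.g. a
# namespaceID mismatch after a reformat, or a full dfs.data.dir disk.
tail -50 $HADOOP_HOME/logs/hadoop-*-datanode-*.log

# Rule out a full disk on the DataNode's storage directory.
df -h
```

A common culprit for this combination of symptoms (job dies mid-run, SecondaryNameNode checkpoint fails, 0 DataNodes available) is the NameNode or DataNode JVM crashing, often from disk exhaustion or an OutOfMemoryError, so the daemon logs are the place to confirm.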