2013-06-18

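I have exactly the same error as in this post: http://lucene.472066.n3.nabble.com/Multinode-cluster-only-recognizes-1-node-td3997585.html (the multinode cluster only recognizes 1 live node).

How can I solve this problem?

Edit:

We want to run a 2-node cluster. Our code works perfectly. We have one master node and one slave node. Because we want the master node to act as a slave as well, we configured the masters and slaves files as: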

conf/masters:
master

conf/slaves:
master
slave

When I run bin/start-all.sh on the master node, jps lists these, as expected:

namenode
secondarynamenode
jobtracker
datanode
tasktracker
jps

On the slave node, jps lists these, as expected:

datanode 
tasktracker 
jps 

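Everything looks normal. In our configuration, mapred-site.xml and core-site.xml know the master's IP and port. The replication factor is set to 2 in hdfs-site.xml.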
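The three values described above can be read back from the site files on each node to confirm that both nodes agree. A minimal sketch, assuming the Hadoop 1.x property names `fs.default.name`, `mapred.job.tracker` and `dfs.replication`. The sample XML is illustrative, not the actual files: the NameNode address is the one that appears in the console output at the end, the JobTracker port 9001 is an assumption, and `read_properties` is a hypothetical helper, not part of Hadoop.

```python
import xml.etree.ElementTree as ET


def read_properties(site_xml: str) -> dict:
    """Return {name: value} for every <property> in a Hadoop *-site.xml document."""
    root = ET.fromstring(site_xml)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


# Illustrative file contents (assumptions, not the real configuration).
CORE_SITE = """<configuration>
  <property><name>fs.default.name</name><value>hdfs://10.0.2.15:9000</value></property>
</configuration>"""

MAPRED_SITE = """<configuration>
  <property><name>mapred.job.tracker</name><value>10.0.2.15:9001</value></property>
</configuration>"""

HDFS_SITE = """<configuration>
  <property><name>dfs.replication</name><value>2</value></property>
</configuration>"""

# Merge the properties of all three files into one dictionary.
settings = {}
for doc in (CORE_SITE, MAPRED_SITE, HDFS_SITE):
    settings.update(read_properties(doc))

for key in ("fs.default.name", "mapred.job.tracker", "dfs.replication"):
    print(key, "=", settings.get(key))
```

Running the same reader against the files copied from the master and from the slave makes any mismatch between the two nodes easy to spot.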

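We run a MapReduce application on this configuration, but I think it only runs on the master node's jobtracker. When I look at the JobTracker UI, the number of nodes is 1.

Another scenario:

If I do not want to use the master as a slave as well, I change the masters and slaves files like this: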

conf/masters: 
master 

conf/slaves: 
slave 

Now, on the master node, jps gives:

namenode
secondarynamenode
jobtracker
jps

On the slave node, jps gives these, as expected:

datanode 
tasktracker 
jps 

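In this configuration, it gives me the "could only be replicated to 0 nodes, instead of 1" error. I have added the full console output at the end.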
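This message means the NameNode had no live DataNode on which to place the block when the job client uploaded job.jar. One way to narrow this down is to check, from the slave, whether the NameNode's port is reachable at all. A minimal sketch: `can_connect` is a hypothetical helper, and a successful connection only shows that the port is open, not that the DataNode has registered.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

On the slave, `can_connect("10.0.2.15", 9000)` would test the NameNode address used in the job command line; if it returns False, the DataNode on the slave cannot be reaching the NameNode at that address either.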

By the way, the hadoop_home directory path is the same on both nodes, so that is not the problem here.

What could be the problem?

Full console output:

[[email protected] hadoop-1.1.2]$ bin/hadoop jar /home/adminuser/Desktop/proje/proje.jar arkadasoner.Main hdfs://10.0.2.15:9000/input/id.txt hdfs://10.0.2.15:9000/output/x.txt hdfs://10.0.2.15:9000/output/y.txt hdfs://10.0.2.15:9000/output/z.txt

13/06/16 14:36:59 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. 
13/06/16 14:36:59 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hadoop-adminuser/mapred/staging/adminuser/.staging/job_201306161433_0001/job.jar could only be replicated to 0 nodes, instead of 1 
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639) 
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:736) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:601) 
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578) 
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393) 
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387) 

    at org.apache.hadoop.ipc.Client.call(Client.java:1107) 
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229) 
    at com.sun.proxy.$Proxy1.addBlock(Unknown Source) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:601) 
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85) 
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62) 
    at com.sun.proxy.$Proxy1.addBlock(Unknown Source) 
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3686) 
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3546) 
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2749) 
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2989) 

13/06/16 14:36:59 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null 
13/06/16 14:36:59 WARN hdfs.DFSClient: Could not get block locations. Source file "/tmp/hadoop-adminuser/mapred/staging/adminuser/.staging/job_201306161433_0001/job.jar" - Aborting... 
13/06/16 14:36:59 INFO mapred.JobClient: Cleaning up the staging area hdfs://10.0.2.15:9000/tmp/hadoop-adminuser/mapred/staging/adminuser/.staging/job_201306161433_0001 
13/06/16 14:36:59 ERROR security.UserGroupInformation: PriviledgedActionException as:adminuser cause:org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hadoop-adminuser/mapred/staging/adminuser/.staging/job_201306161433_0001/job.jar could only be replicated to 0 nodes, instead of 1 
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639) 
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:736) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:601) 
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578) 
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393) 
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387) 

Exception in thread "main" org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hadoop-adminuser/mapred/staging/adminuser/.staging/job_201306161433_0001/job.jar could only be replicated to 0 nodes, instead of 1 
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639) 
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:736) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:601) 
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578) 
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393) 
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387) 

    at org.apache.hadoop.ipc.Client.call(Client.java:1107) 
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229) 
    at com.sun.proxy.$Proxy1.addBlock(Unknown Source) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:601) 
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85) 
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62) 
    at com.sun.proxy.$Proxy1.addBlock(Unknown Source) 
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3686) 
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3546) 
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2749) 
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2989) 
13/06/16 14:36:59 ERROR hdfs.DFSClient: Failed to close file /tmp/hadoop-adminuser/mapred/staging/adminuser/.staging/job_201306161433_0001/job.jar 
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hadoop-adminuser/mapred/staging/adminuser/.staging/job_201306161433_0001/job.jar could only be replicated to 0 nodes, instead of 1 
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639) 
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:736) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:601) 
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578) 
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393) 
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387) 

    at org.apache.hadoop.ipc.Client.call(Client.java:1107) 
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229) 
    at com.sun.proxy.$Proxy1.addBlock(Unknown Source) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:601) 
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85) 
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62) 
    at com.sun.proxy.$Proxy1.addBlock(Unknown Source) 
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3686) 
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3546) 
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2749) 
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2989) 

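Would you mind posting 'your' problem here? Showing the logs would also help. I don't quite understand the problem. What does jps show on your master and slave? How many slaves do you have? – Tariq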

Answers


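Please check that the configuration files on every master and slave use an IP address or hostname, such as hdfs://master:54310 (it should be the same on every master and slave), where master is the hostname in my /etc/hosts file that points to the master node.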

I ran into the same problem because I was using hdfs://localhost:54310 on all nodes. I then changed it to hdfs://master:54310, or hdfs://xxxx:54310 where xxxx is the address of the master node.
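The check described in this answer can be sketched as a small script applied to the filesystem URI configured on each node. It is only a sketch: `points_to_loopback` is a hypothetical helper, and it recognizes only the name `localhost` and literal loopback IP addresses, not hostnames that /etc/hosts happens to map to 127.0.0.1.

```python
import ipaddress
from urllib.parse import urlparse


def points_to_loopback(fs_uri: str) -> bool:
    """Return True if the host part of an hdfs:// URI is localhost or a loopback IP."""
    host = urlparse(fs_uri).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example "master"); it is resolved elsewhere.
        return False


for uri in ("hdfs://localhost:54310", "hdfs://master:54310", "hdfs://127.0.0.1:54310"):
    verdict = "points at the local machine" if points_to_loopback(uri) else "ok"
    print(uri, "->", verdict)
```

A slave whose URI points at loopback would look for the NameNode on itself, so its DataNode would never register with the master.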