
I am trying to deploy Hadoop-RDMA on an 8-node InfiniBand (OFED-1.5.3-4.0.42) cluster and have got stuck on the following problem: Hadoop: File ... could only be replicated to 0 nodes, instead of 1.

 
[email protected]:~/hadoop-rdma-0.9.8> ./bin/hadoop dfs -copyFromLocal ../pg132.txt /user/frolo/input/pg132.txt 
Warning: $HADOOP_HOME is deprecated. 

14/02/05 19:06:30 WARN hdfs.DFSClient: DataStreamer Exception: java.lang.reflect.UndeclaredThrowableException 
    at com.sun.proxy.$Proxy1.addBlock(Unknown Source) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(Unknown Source) 
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Unknown Source) 
    at com.sun.proxy.$Proxy1.addBlock(Unknown Source) 
    at org.apache.hadoop.hdfs.From.Code(Unknown Source) 
    at org.apache.hadoop.hdfs.From.F(Unknown Source) 
    at org.apache.hadoop.hdfs.From.F(Unknown Source) 
    at org.apache.hadoop.hdfs.The.run(Unknown Source) 
Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/frolo/input/pg132.txt could only be replicated to 0 nodes, instead of 1 
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(Unknown Source) 
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(Unknown Source) 
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.hadoop.ipc.RPC$Server.call(Unknown Source) 
    at org.apache.hadoop.ipc.rdma.madness.Code(Unknown Source) 
    at org.apache.hadoop.ipc.rdma.madness.run(Unknown Source) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(Unknown Source) 
    at org.apache.hadoop.ipc.rdma.be.run(Unknown Source) 
    at org.apache.hadoop.ipc.rdma.RDMAClient.Code(Unknown Source) 
    at org.apache.hadoop.ipc.rdma.RDMAClient.call(Unknown Source) 
    at org.apache.hadoop.ipc.Tempest.invoke(Unknown Source) 
    ... 12 more 

14/02/05 19:06:30 WARN hdfs.DFSClient: Error Recovery for null bad datanode[0] nodes == null 
14/02/05 19:06:30 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/frolo/input/pg132.txt" - Aborting... 
14/02/05 19:06:30 INFO hdfs.DFSClient: exception in isClosed 

It seems that when I start copying from the local file system to HDFS, no data is ever transferred to the DataNodes. I tested the availability of the DataNodes:

 
[email protected]:~/hadoop-rdma-0.9.8> ./bin/hadoop dfsadmin -report 
Warning: $HADOOP_HOME is deprecated. 

Configured Capacity: 0 (0 KB) 
Present Capacity: 0 (0 KB) 
DFS Remaining: 0 (0 KB) 
DFS Used: 0 (0 KB) 
DFS Used%: �% 
Under replicated blocks: 0 
Blocks with corrupt replicas: 0 
Missing blocks: 0 

------------------------------------------------- 
Datanodes available: 0 (4 total, 4 dead) 

Name: 10.10.1.13:50010 
Decommission Status : Normal 
Configured Capacity: 0 (0 KB) 
DFS Used: 0 (0 KB) 
Non DFS Used: 0 (0 KB) 
DFS Remaining: 0(0 KB) 
DFS Used%: 100% 
DFS Remaining%: 0% 
Last contact: Wed Feb 05 19:02:54 MSK 2014 


Name: 10.10.1.14:50010 
Decommission Status : Normal 
Configured Capacity: 0 (0 KB) 
DFS Used: 0 (0 KB) 
Non DFS Used: 0 (0 KB) 
DFS Remaining: 0(0 KB) 
DFS Used%: 100% 
DFS Remaining%: 0% 
Last contact: Wed Feb 05 19:02:54 MSK 2014 


Name: 10.10.1.16:50010 
Decommission Status : Normal 
Configured Capacity: 0 (0 KB) 
DFS Used: 0 (0 KB) 
Non DFS Used: 0 (0 KB) 
DFS Remaining: 0(0 KB) 
DFS Used%: 100% 
DFS Remaining%: 0% 
Last contact: Wed Feb 05 19:02:54 MSK 2014 


Name: 10.10.1.11:50010 
Decommission Status : Normal 
Configured Capacity: 0 (0 KB) 
DFS Used: 0 (0 KB) 
Non DFS Used: 0 (0 KB) 
DFS Remaining: 0(0 KB) 
DFS Used%: 100% 
DFS Remaining%: 0% 
Last contact: Wed Feb 05 19:02:55 MSK 2014 

Meanwhile, an attempted mkdir in the HDFS file system succeeded. Restarting the Hadoop daemons had no positive effect.

Could you help me to resolve this problem? Thank you.

Best, Alex


Possible duplicate of [HDFS error: could only be replicated to 0 nodes, instead of 1](http://stackoverflow.com/questions/5293446/hdfs-error-could-only-be-replicated-to-0-nodes-instead-of-1) – vefthym


It seems I had not noticed that the capacity is 0 KB. I don't understand why. – Alexander


Your DataNodes are not started; check the DataNode logs. "Datanodes available: 0 (4 total, 4 dead)" – rVr
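A minimal sketch of that check, assuming the stock Hadoop 1.x layout where daemon logs live under $HADOOP_HOME/logs:

 
jps 
tail -n 50 $HADOOP_HOME/logs/hadoop-*-datanode-*.log 

jps should list a DataNode process on every worker node; if it is missing, the tail of the DataNode log usually shows why the daemon died on startup.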

Answer


I found my problem. It was related to the configuration of hadoop.tmp.dir, which had been set to an NFS partition. By default it is configured as /tmp, i.e. the local fs. After removing hadoop.tmp.dir from core-site.xml, the problem was resolved.
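Equivalently, instead of deleting the property, hadoop.tmp.dir can be pointed explicitly at a node-local disk. A minimal core-site.xml sketch (the /data/hadoop-tmp path is only an example):

 
<?xml version="1.0"?> 
<configuration> 
 <property> 
 <name>hadoop.tmp.dir</name> 
 <!-- must be a node-local path, not an NFS mount; this value is an example --> 
 <value>/data/hadoop-tmp</value> 
 </property> 
</configuration> 

After changing it, restart the daemons and re-run ./bin/hadoop dfsadmin -report; the DataNodes should then report a non-zero Configured Capacity.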
