2017-04-04 197 views
2

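Upload a file from one server to HDFS on another server

I want to upload files from an external Windows server to HDFS on a different server. HDFS runs inside a Cloudera Docker container on that server.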

I connect to HDFS from the Windows server as follows:

Configuration conf = new Configuration(); 
conf.set("fs.defaultFS", "hdfs://%HDFS_SERVER_IP%:8020"); 
fs = FileSystem.get(conf); 

When I call fs.copyFromLocalFile(localFilePath, hdfsFilePath); it throws the exception below and creates the file in HDFS without any content:

org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/test/test.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation. 
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1595) 
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3287) 
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:677) 
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:213) 
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:485) 
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) 
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) 
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073) 
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) 
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693) 
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080) 

    at org.apache.hadoop.ipc.Client.call(Client.java:1475) 
    at org.apache.hadoop.ipc.Client.call(Client.java:1412) 
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) 
    at com.sun.proxy.$Proxy15.addBlock(Unknown Source) 
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:498) 
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191) 
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) 
    at com.sun.proxy.$Proxy16.addBlock(Unknown Source) 
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1455) 
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1251) 
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:448) 

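There also seems to be a problem on the DataNode. The following is copied from its log:

Retrying connect to server: 0.0.0.0/0.0.0.0:8022. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)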
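
The target in the DataNode log line above is 0.0.0.0, the wildcard address. It is valid as a bind address but not as a destination to connect to. The following is a minimal sketch, using only the Java standard library, that flags such lines when scanning a log. The class and method names (RetryLogCheck, targetsWildcard) are hypothetical helpers for illustration and are not part of Hadoop.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: inspects "Retrying connect to server" log lines.
final class RetryLogCheck {
    // Captures the "host/ip:port" target that follows the retry message.
    private static final Pattern TARGET =
            Pattern.compile("Retrying connect to server: ([^/\\s]+)/([0-9.]+):(\\d+)");

    /** Returns true if the line is a retry message whose target IP is the wildcard address. */
    static boolean targetsWildcard(String logLine) {
        Matcher m = TARGET.matcher(logLine);
        if (!m.find()) {
            return false; // not a retry line
        }
        try {
            // A literal IPv4 address is parsed without a DNS lookup.
            return InetAddress.getByName(m.group(2)).isAnyLocalAddress();
        } catch (UnknownHostException e) {
            return false; // malformed address in the log line
        }
    }

    public static void main(String[] args) {
        String line = "Retrying connect to server: 0.0.0.0/0.0.0.0:8022. Already tried 0 time(s)";
        System.out.println(targetsWildcard(line));
    }
}
```

If the check returns true, the address the DataNode uses to find the NameNode is probably coming from a property set to 0.0.0.0, which is worth comparing against hdfs-site.xml.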

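I formatted the DataNode and restarted HDFS, but I still cannot upload the file in this setup. Operations other than reading and writing file contents work, and with the same configuration I can transfer files when the local system and HDFS are on the same server.

The server sits behind a proxy server, and I configured the proxy environment for the HDFS Docker container. How can I transfer files between different servers using the HDFS Java API?

Update 1:

hdfs dfsadmin -report: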

[[email protected] conf]# hdfs dfsadmin -report 
Configured Capacity: 211243687936 (196.74 GB) 
Present Capacity: 78773199014 (73.36 GB) 
DFS Remaining: 77924307110 (72.57 GB) 
DFS Used: 848891904 (809.57 MB) 
DFS Used%: 1.08% 
Under replicated blocks: 0 
Blocks with corrupt replicas: 0 
Missing blocks: 0 
Missing blocks (with replication factor 1): 0 

------------------------------------------------- 
Live datanodes (1): 

Name: XXXX:50010 (quickstart.cloudera) 
Hostname: quickstart.cloudera 
Decommission Status : Normal 
Configured Capacity: 211243687936 (196.74 GB) 
DFS Used: 848891904 (809.57 MB) 
Non DFS Used: 132470488922 (123.37 GB) 
DFS Remaining: 77924307110 (72.57 GB) 
DFS Used%: 0.40% 
DFS Remaining%: 36.89% 
Configured Cache Capacity: 0 (0 B) 
Cache Used: 0 (0 B) 
Cache Remaining: 0 (0 B) 
Cache Used%: 100.00% 
Cache Remaining%: 0.00% 
Xceivers: 6 
Last contact: Wed Apr 05 07:15:00 UTC 2017 
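
The report shows the DataNode as live and listening on port 50010. When writing a file, an HDFS client first contacts the NameNode over RPC (port 8020 here) and then streams the block data directly to the DataNode. A common cause of the "could only be replicated to 0 nodes ... 1 node(s) are excluded" error is therefore that the client can reach the NameNode but not the DataNode's advertised address. The sketch below, which uses only the Java standard library, checks TCP reachability of both ports from the client machine. The class name PortCheck, the default host and the 3-second timeout are assumptions for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic helper, not part of Hadoop.
final class PortCheck {
    /** Tries to open a TCP connection within timeoutMs; returns false on any I/O failure. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the HDFS server address as the first argument; "localhost" is only a placeholder.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println("NameNode RPC 8020:   " + canConnect(host, 8020, 3000));
        System.out.println("DataNode data 50010: " + canConnect(host, 50010, 3000));
    }
}
```

Note that a successful check against the server's external IP is not conclusive on its own: the NameNode hands the client the address the DataNode registered with, which inside a Docker container may be a container-internal address or hostname.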

yarn node -list -all:

17/04/05 07:14:02 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8032 
Total Nodes:1 
     Node-Id    Node-State Node-Http-Address  Number-of-Running-Containers 
quickstart.cloudera:37449    RUNNING quickstart.cloudera:8042         0 

core-site.xml:

<?xml version="1.0"?> 
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?> 

<configuration> 
    <property> 
    <name>fs.defaultFS</name> 
    <value>hdfs://quickstart.cloudera:8020</value> 
    </property> 

    <!-- OOZIE proxy user setting --> 
    <property> 
    <name>hadoop.proxyuser.oozie.hosts</name> 
    <value>*</value> 
    </property> 
    <property> 
    <name>hadoop.proxyuser.oozie.groups</name> 
    <value>*</value> 
    </property> 

    <!-- HTTPFS proxy user setting --> 
    <property> 
    <name>hadoop.proxyuser.httpfs.hosts</name> 
    <value>*</value> 
    </property> 
    <property> 
    <name>hadoop.proxyuser.httpfs.groups</name> 
    <value>*</value> 
    </property> 

    <!-- Llama proxy user setting --> 
    <property> 
    <name>hadoop.proxyuser.llama.hosts</name> 
    <value>*</value> 
    </property> 
    <property> 
    <name>hadoop.proxyuser.llama.groups</name> 
    <value>*</value> 
    </property> 

    <!-- Hue proxy user setting --> 
    <property> 
    <name>hadoop.proxyuser.hue.hosts</name> 
    <value>*</value> 
    </property> 
    <property> 
    <name>hadoop.proxyuser.hue.groups</name> 
    <value>*</value> 
    </property> 

</configuration> 

hdfs-site.xml:

<?xml version="1.0"?> 
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?> 

<configuration> 
    <property> 
    <name>dfs.replication</name> 
    <value>1</value> 
    </property> 
    <!-- Immediately exit safemode as soon as one DataNode checks in. 
     On a multi-node cluster, these configurations must be removed. --> 
    <property> 
    <name>dfs.safemode.extension</name> 
    <value>0</value> 
    </property> 
    <property> 
    <name>dfs.safemode.min.datanodes</name> 
    <value>1</value> 
    </property> 
    <property> 
    <name>dfs.permissions.enabled</name> 
    <value>false</value> 
    </property> 
    <property> 
    <name>dfs.permissions</name> 
    <value>false</value> 
    </property> 
    <property> 
    <name>dfs.safemode.min.datanodes</name> 
    <value>1</value> 
    </property> 
    <property> 
    <name>dfs.webhdfs.enabled</name> 
    <value>true</value> 
    </property> 
    <property> 
    <name>hadoop.tmp.dir</name> 
    <value>/var/lib/hadoop-hdfs/cache/${user.name}</value> 
    </property> 
    <property> 
    <name>dfs.namenode.name.dir</name> 
    <value>/var/lib/hadoop-hdfs/cache/${user.name}/dfs/name</value> 
    </property> 
    <property> 
    <name>dfs.namenode.checkpoint.dir</name> 
    <value>/var/lib/hadoop-hdfs/cache/${user.name}/dfs/namesecondary</value> 
    </property> 
    <property> 
    <name>dfs.datanode.data.dir</name> 
    <value>/var/lib/hadoop-hdfs/cache/${user.name}/dfs/data</value> 
    </property> 
    <property> 
    <name>dfs.namenode.rpc-bind-host</name> 
    <value>0.0.0.0</value> 
    </property> 

    <property> 
    <name>dfs.namenode.servicerpc-address</name> 
    <value>0.0.0.0:8022</value> 
    </property> 
    <property> 
    <name>dfs.https.address</name> 
    <value>0.0.0.0:50470</value> 
    </property> 
    <property> 
    <name>dfs.namenode.http-address</name> 
    <value>0.0.0.0:50070</value> 
    </property> 
    <property> 
    <name>dfs.datanode.address</name> 
    <value>0.0.0.0:50010</value> 
    </property> 
    <property> 
    <name>dfs.datanode.ipc.address</name> 
    <value>0.0.0.0:50020</value> 
    </property> 
    <property> 
    <name>dfs.datanode.http.address</name> 
    <value>0.0.0.0:50075</value> 
    </property> 
    <property> 
    <name>dfs.datanode.https.address</name> 
    <value>0.0.0.0:50475</value> 
    </property> 
    <property> 
    <name>dfs.namenode.secondary.http-address</name> 
    <value>0.0.0.0:50090</value> 
    </property> 
    <property> 
    <name>dfs.namenode.secondary.https-address</name> 
    <value>0.0.0.0:50495</value> 
    </property> 

    <!-- Impala configuration --> 
    <property> 
    <name>dfs.datanode.hdfs-blocks-metadata.enabled</name> 
    <value>true</value> 
    </property> 
    <property> 
    <name>dfs.client.file-block-storage-locations.timeout.millis</name> 
    <value>10000</value> 
    </property> 
    <property> 
    <name>dfs.client.read.shortcircuit</name> 
    <value>true</value> 
    </property> 
    <property> 
    <name>dfs.domain.socket.path</name> 
    <value>/var/run/hadoop-hdfs/dn._PORT</value> 
    </property> 
</configuration> 
+0

Where are you running this code from? It must be on the Windows server. Also post the full stack trace. – franklinsijo

+0

How do you initialize the FileSystem? – Serhiy

+0

Does the datanode have enough space? Add the output of 'hdfs dfsadmin -report' and 'yarn node -list -all', and the properties from 'core-site.xml' and 'hdfs-site.xml'. – franklinsijo

Answers

0

I only changed conf.set("fs.defaultFS", "hdfs://%HDFS_SERVER_IP%:8020") to conf.set("fs.defaultFS", "webhdfs://%HDFS_SERVER_IP%:50070"), and then I was able to upload the file to HDFS on a different server. I referred to this link.
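
For context, the webhdfs:// scheme talks to the NameNode's HTTP port (50070 in the configuration above) through the WebHDFS REST API instead of the RPC port. A file is created with an HTTP PUT to a URL of the form /webhdfs/v1/<path>?op=CREATE; the NameNode answers with a redirect to a DataNode, and the client sends the file content to that location. The sketch below only builds the first URL, using the Java standard library. The class name WebHdfsUrl, the host namenode.example.com and the user name test are placeholders for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Hadoop: builds the first-step WebHDFS CREATE URL.
final class WebHdfsUrl {
    /** Builds http://host:port/webhdfs/v1<hdfsPath>?op=CREATE&overwrite=true&user.name=<user>. */
    static URI createUrl(String host, int port, String hdfsPath, String user) {
        String query = "op=CREATE&overwrite=true&user.name=" + user;
        try {
            // The multi-argument URI constructor quotes characters that are illegal in a path.
            return new URI("http", null, host, port, "/webhdfs/v1" + hdfsPath, query, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid WebHDFS URL components", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder values; replace with the real NameNode host and HDFS user.
        System.out.println(createUrl("namenode.example.com", 50070, "/user/test/test.txt", "test"));
    }
}
```

Because of the redirect, the client still needs to reach the DataNode's HTTP port (50075 in the hdfs-site.xml above) at whatever address the NameNode returns.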

1

The RPC port in the property fs.defaultFS in core-site.xml conflicts with the one in dfs.namenode.servicerpc-address in hdfs-site.xml.

Change it in hdfs-site.xml as follows and restart the services.

<property> 
    <name>dfs.namenode.servicerpc-address</name> 
    <value>0.0.0.0:8020</value> 
</property> 
+0

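I made the change and the namenode failed to initialize. I then renamed the property to dfs.namenode.rpc-address and still get the same exception. "org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8020." still appears in the datanode's log. – isspek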

+0

What error are you getting? – franklinsijo

+0

It is the RemoteException, the same as described in the question, plus "org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8020" in the datanode's log. – isspek
