2015-12-30

Below is a snapshot of the health issues reported by CM. The DataNodes in the list keep changing. Some errors from the DataNode logs follow. Cloudera Manager health issues: NameNode connectivity, web server status.

3:59:31.859 PM ERROR org.apache.hadoop.hdfs.server.datanode.DataNode 
    datanode05.hadoop.com:50010:DataXceiver error processing WRITE_BLOCK operation src: /10.248.200.113:45252 dest: /10.248.200.105:50010 
    java.io.IOException: Premature EOF from inputStream 
     at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109) 
     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:414) 
     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:635) 
     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:564) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:103) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:67) 
     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221) 
     at java.lang.Thread.run(Thread.java:662) 
5:46:03.606 PM INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
    Exception for BP-846315089-10.248.200.4-1369774276029:blk_-780307518048042460_200374997 
    java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.248.200.105:50010 remote=/10.248.200.122:43572] 
     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:165) 
     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:156) 
     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:129) 
     at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) 
     at java.io.BufferedInputStream.read1(BufferedInputStream.java:258) 
     at java.io.BufferedInputStream.read(BufferedInputStream.java:317) 
     at java.io.DataInputStream.read(DataInputStream.java:132) 
     at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109) 
     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:414) 
     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:635) 
     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:564) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:103) 
     at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:67) 
     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221) 
     at java.lang.Thread.run(Thread.java:662) 

Snapshot:

Health Issues reported on CM

I can't figure out the root cause of the issue. I can manually connect from one DataNode to another without any problem, so I don't believe it is a network issue. Also, the missing-block and under-replicated-block counts keep changing (up & down).
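One way to track how those fluctuating counts move over time is to poll the fsck summary. A minimal sketch, assuming the usual CDH4 `hdfs fsck /` summary lines (the exact wording can differ between versions, so treat the patterns below as an assumption to adjust):

```shell
# Hypothetical helper: read an fsck report on stdin and print the two
# fluctuating counters. Assumes summary lines shaped like:
#   Missing blocks:            0
#   Under-replicated blocks:   12 (0.004 %)
summarize_fsck() {
  awk '
    /Missing blocks:/          { m = $3 }   # third field is the count
    /Under-replicated blocks:/ { u = $3 }
    END { printf "missing=%s under_replicated=%s\n", m, u }
  '
}

# On the cluster you would pipe the real report in, e.g.:
#   hdfs fsck / | summarize_fsck
```

Running this every few minutes (e.g. from cron) gives a simple time series of the two counts, which makes it easier to see whether they correlate with the DataNodes flapping in CM.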

Cloudera Manager: Cloudera Standard 4.8.1

CDH 4.7

Any help resolving this issue is appreciated.

Update: Jan 1, 2016

For the DataNodes listed as bad, when I look at the DataNode logs I see this message a lot...

11:58:30.066 AM INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
Receiving BP-846315089-10.248.200.4-1369774276029:blk_-706861374092956879_36606459 src: /10.248.200.123:56795 dest: /10.248.200.112:50010 

Why is this DataNode receiving so many blocks from other DataNodes at the same time? It looks like, because of this activity, the DataNode cannot respond to NameNode requests in time and therefore times out. All the bad DataNodes show the same pattern.
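The "receiving lots of blocks" pattern can be confirmed from the log itself by counting incoming transfers per peer. A sketch, assuming the `Receiving BP-... src: /ip:port dest: ...` line format shown in the snippet above (the log path is a placeholder):

```shell
# Hypothetical helper: read a DataNode log on stdin and print
# "<count> <src-ip>" sorted by count, so the busiest senders come first.
count_senders() {
  grep 'Receiving BP-' |
    sed -n 's#.* src: /\([0-9.]*\):.*#\1#p' |   # keep only the source IP
    sort | uniq -c | sort -rn
}

# On a DataNode you might run something like:
#   count_senders < /var/log/hadoop-hdfs/<datanode-log-file>
```

If one or two source IPs dominate the output during the windows when the node goes bad, that points at re-replication traffic (e.g. from blocks being marked under-replicated) rather than at a network fault.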

Answer


A similar question was answered here:

hdfs data node disconnected from namenode
Please check your firewall. Use

telnet ipaddress port 

to check connectivity.
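For sweeping every node/port pair rather than telnetting one at a time, a scriptable probe can help. A sketch using bash's `/dev/tcp` redirection (bash-specific; the hostnames in the usage comment are placeholders, and 50010 is the DataNode transfer port seen in the logs above):

```shell
# Hypothetical probe: succeeds if a TCP connection to host:port opens
# within 3 seconds, using bash's /dev/tcp pseudo-device.
probe() {
  local host=$1 port=$2
  if timeout 3 bash -c "cat < /dev/null > /dev/tcp/$host/$port" 2>/dev/null; then
    echo "$host:$port open"
  else
    echo "$host:$port CLOSED/unreachable"
  fi
}

# Example sweep across the flapping DataNodes:
#   for h in datanode05.hadoop.com 10.248.200.112; do probe "$h" 50010; done
```

Unlike an interactive telnet session, this can be run repeatedly from cron while the nodes are flapping, to catch intermittent reachability problems that a one-off manual check would miss.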


I tried telnet, and it connects to the other nodes successfully. It doesn't seem to be a firewall issue. The nodes currently listed with connectivity problems... stop showing up in the list after a few minutes. It keeps switching. – scott