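Cassandra node going down with AssertionError, IOException (Broken pipe) and OutOfMemoryError (Java heap space)
We are using Cassandra 2.0.3 with six nodes across two data centers (three nodes in each). One of our nodes frequently hits the following errors in succession: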
- java.lang.AssertionError
- java.io.IOException: Broken pipe
- java.lang.OutOfMemoryError: Java heap space
ERROR [Native-Transport-Requests:4836102] 2016-01-26 18:50:41,905 ErrorMessage.java (line 222) Unexpected exception during request
java.lang.AssertionError: /172.31.x.x
at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:919)
at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:534)
at org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:578)
at org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:171)
at org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:156)
at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:188)
at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:222)
at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119)
at org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:304)
at org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:43)
at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:67)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
WARN [GossipTasks:1] 2016-01-26 18:50:46,742 Gossiper.java (line 612) Gossip stage has 35 pending tasks; skipping status check (no nodes will be marked down)
INFO [ScheduledTasks:1] 2016-01-26 18:50:41,905 StatusLogger.java (line 70) ReadStage 32 163 32554805 0 0
ERROR [Native-Transport-Requests:4836140] 2016-01-26 18:52:19,743 ErrorMessage.java (line 222) Unexpected exception during request
java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:51)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
at org.jboss.netty.channel.socket.nio.SocketSendBufferPool$UnpooledSendBuffer.transferTo(SocketSendBufferPool.java:203)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:202)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromTaskLoop(AbstractNioWorker.java:152)
at org.jboss.netty.channel.socket.nio.AbstractNioChannel$WriteTask.run(AbstractNioChannel.java:335)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:366)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:290)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
ERROR [ReplicateOnWriteStage:178749] 2016-01-26 18:52:19,673 CassandraDaemon.java (line 187) Exception in thread Thread[ReplicateOnWriteStage:178749,5,main]
java.lang.OutOfMemoryError: Java heap space
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(CompressedRandomAccessReader.java:79)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:43)
at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.createReader(CompressedPoolingSegmentedFile.java:48)
at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:39)
at org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1195)
at org.apache.cassandra.db.columniterator.SSTableNamesIterator.createFileDataInput(SSTableNamesIterator.java:96)
at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:109)
at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:62)
at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:87)
at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:250)
at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1487)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1306)
at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:332)
at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:55)
at org.apache.cassandra.db.CounterMutation.makeReplicationMutation(CounterMutation.java:100)
at org.apache.cassandra.service.StorageProxy$8$1.runMayThrow(StorageProxy.java:1134)
at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1936)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
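Could you please help me understand the causes behind these errors?
1) java.lang.AssertionError in StorageProxy.submitHint. Is this related to a large number of hints accumulating on a single node?
2) java.io.IOException: Broken pipe. Which socket connection is being lost here? Is this also caused by hints?
3) java.lang.OutOfMemoryError: Java heap space. Is this caused by the key cache or indexes held in memory, or by something else?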
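To check how often each of these errors occurs on the affected node, the log can be tallied by exception class. The following is only a minimal sketch, assuming the log keeps the entry format shown in the excerpt above (level, thread in brackets, timestamp, source file); the function name `tally_errors` and both regular expressions are hypothetical helpers, not part of Cassandra.

```python
import re
from collections import Counter

# Header of a log entry as it appears in the excerpt:
# LEVEL [Thread:id] yyyy-mm-dd HH:MM:SS,mmm Source.java (line N) message
ENTRY = re.compile(
    r"^(?P<level>ERROR|WARN|INFO)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s"
)
# First line of a stack trace: qualified exception class, optional message.
EXCEPTION = re.compile(r"^(?P<cls>[\w.$]+(?:Error|Exception))(?::\s*(?P<msg>.*))?$")


def tally_errors(lines):
    """Count exception classes that directly follow an ERROR entry."""
    counts = Counter()
    after_error = False
    for raw in lines:
        line = raw.strip()
        entry = ENTRY.match(line)
        if entry:
            # Remember whether the next line may be an exception header.
            after_error = entry.group("level") == "ERROR"
            continue
        exc = EXCEPTION.match(line)
        if after_error and exc:
            counts[exc.group("cls")] += 1
        after_error = False
    return counts


# Sample lines copied from the log excerpt in this post.
sample = """\
ERROR [Native-Transport-Requests:4836102] 2016-01-26 18:50:41,905 ErrorMessage.java (line 222) Unexpected exception during request
java.lang.AssertionError: /172.31.x.x
at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:919)
WARN [GossipTasks:1] 2016-01-26 18:50:46,742 Gossiper.java (line 612) Gossip stage has 35 pending tasks; skipping status check (no nodes will be marked down)
ERROR [Native-Transport-Requests:4836140] 2016-01-26 18:52:19,743 ErrorMessage.java (line 222) Unexpected exception during request
java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
ERROR [ReplicateOnWriteStage:178749] 2016-01-26 18:52:19,673 CassandraDaemon.java (line 187) Exception in thread Thread[ReplicateOnWriteStage:178749,5,main]
java.lang.OutOfMemoryError: Java heap space
"""

for cls, count in tally_errors(sample.splitlines()).most_common():
    print(f"{count:5d}  {cls}")
```

Running the same function over the full system.log, for example with `tally_errors(open(path))` where `path` is wherever the node writes its log, would show which of the three errors dominates.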
EDIT >>>>>>>>>> Heap analyzer results:
Leak Suspects, System Overview
click here to view memory usage in graph
Problem Suspect 1
One instance of "org.apache.cassandra.db.ColumnFamilyStore" loaded by "sun.misc.Launcher$AppClassLoader @ 0x613e088a0" occupies 2,118,988,808 (24.81%) bytes. The memory is accumulated in one instance of "com.google.common.collect.RegularImmutableSet" loaded by "sun.misc.Launcher$AppClassLoader @ 0x613e088a0".
Keywords
org.apache.cassandra.db.ColumnFamilyStore
com.google.common.collect.RegularImmutableSet
sun.misc.Launcher$AppClassLoader @ 0x613e088a0
Problem Suspect 2
616 instances of "java.lang.Thread", loaded by "<system class loader>" occupy 3,523,257,240 (41.25%) bytes.
Biggest instances:
java.lang.Thread @ 0x638299c08 ReadStage:632 - 134,347,752 (1.57%) bytes.
java.lang.Thread @ 0x63b00a200 ReadStage:637 - 134,280,896 (1.57%) bytes.
java.lang.Thread @ 0x634369e48 ReadStage:653 - 134,280,880 (1.57%) bytes.
java.lang.Thread @ 0x63a414ff0 ReadStage:635 - 134,280,880 (1.57%) bytes.
java.lang.Thread @ 0x63bba72d0 ReadStage:641 - 134,280,880 (1.57%) bytes.
java.lang.Thread @ 0x637262780 ReadStage:628 - 133,078,144 (1.56%) bytes.
java.lang.Thread @ 0x634a218c0 ReadStage:654 - 132,945,704 (1.56%) bytes.
java.lang.Thread @ 0x638299d40 ReadStage:633 - 131,541,840 (1.54%) bytes.
java.lang.Thread @ 0x6372626b0 ReadStage:659 - 127,398,120 (1.49%) bytes.
java.lang.Thread @ 0x63beb4318 ReadStage:648 - 123,054,816 (1.44%) bytes.
java.lang.Thread @ 0x638299ba0 ReadStage:631 - 119,311,688 (1.40%) bytes.
java.lang.Thread @ 0x63801ec40 ReadStage:630 - 117,908,592 (1.38%) bytes.
java.lang.Thread @ 0x63bba7408 ReadStage:644 - 114,369,344 (1.34%) bytes.
java.lang.Thread @ 0x638732808 ReadStage:656 - 113,170,032 (1.32%) bytes.
java.lang.Thread @ 0x6387314f8 ReadStage:655 - 111,166,192 (1.30%) bytes.
java.lang.Thread @ 0x63beb42b0 ReadStage:647 - 103,886,728 (1.22%) bytes.
java.lang.Thread @ 0x6372627e8 ReadStage:629 - 92,193,112 (1.08%) bytes.
java.lang.Thread @ 0x63b893688 ReadStage:638 - 86,715,880 (1.02%) bytes.
Keywords
java.lang.Thread
Problem Suspect 3
4,072 instances of "byte[]", loaded by "<system class loader>" occupy 1,810,992,728 (21.20%) bytes.
Keywords
byte[]
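As a sanity check on the report, the byte counts and percentages of the three suspects can be cross-checked against each other. The sketch below uses only the figures quoted in the report; it assumes the three suspects do not overlap, which the report does not state. Each suspect's count divided by its percentage implies a heap of roughly 8.5 billion bytes, and the three together account for 87.26% of it.

```python
# Figures copied from the heap analyzer report: (retained bytes, percent of heap).
suspects = {
    "ColumnFamilyStore (1 instance)": (2_118_988_808, 24.81),
    "java.lang.Thread (616 instances)": (3_523_257_240, 41.25),
    "byte[] (4,072 instances)": (1_810_992_728, 21.20),
}

GIB = 1024 ** 3


def implied_heap_bytes(retained: int, percent: float) -> float:
    """Total heap size implied by one suspect's byte count and percentage."""
    return retained / (percent / 100.0)


for name, (retained, percent) in suspects.items():
    heap = implied_heap_bytes(retained, percent)
    print(f"{name}: {retained / GIB:.2f} GiB, implies a heap of {heap / GIB:.2f} GiB")

# Combined share, assuming the suspects do not overlap (not stated in the report).
total_bytes = sum(retained for retained, _ in suspects.values())
total_percent = sum(percent for _, percent in suspects.values())
print(f"combined: {total_bytes / GIB:.2f} GiB ({total_percent:.2f}% of heap)")
```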
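From the symptoms, it looks like you have some flapping nodes. Can you give the hardware configuration (disk specs, CPU specs and amount of memory)? – doanduyhai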