2014-02-17 · 56 views · Score: 22

Write timeout thrown by the Cassandra DataStax driver: while performing a bulk data load, incrementing counters based on log data, I am hitting a timeout exception. I am using the DataStax 2.0-rc2 Java driver.

Is this a problem of the server not being able to keep up (i.e., a server-side configuration issue), or is it a problem of the client getting tired of waiting for the server to respond? Either way, is there a simple configuration change I can make that would fix this?
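The load is essentially a tight loop of counter increments through the driver. A minimal sketch of that kind of loop, assuming the 2.0-era API (the keyspace, table, and column names are illustrative, not from my actual code):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;

    public class CounterLoad {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("stats"); // hypothetical keyspace
            // Hypothetical schema: CREATE TABLE hits (page text PRIMARY KEY, n counter)
            PreparedStatement inc = session.prepare(
                    "UPDATE hits SET n = n + 1 WHERE page = ?");
            for (String page : new String[] { "/a", "/b", "/a" }) {
                // One synchronous increment per log record; this execute() is
                // where the WriteTimeoutException below surfaces.
                session.execute(inc.bind(page));
            }
            cluster.shutdown(); // close() in later driver versions
        }
    }

The client fails with: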

Exception in thread "main" com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write) 
    at com.datastax.driver.core.exceptions.WriteTimeoutException.copy(WriteTimeoutException.java:54) 
    at com.datastax.driver.core.ResultSetFuture.extractCauseFromExecutionException(ResultSetFuture.java:271) 
    at com.datastax.driver.core.ResultSetFuture.getUninterruptibly(ResultSetFuture.java:187) 
    at com.datastax.driver.core.Session.execute(Session.java:126) 
    at jason.Stats.analyseLogMessages(Stats.java:91) 
    at jason.Stats.main(Stats.java:48) 
Caused by: com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write) 
    at com.datastax.driver.core.exceptions.WriteTimeoutException.copy(WriteTimeoutException.java:54) 
    at com.datastax.driver.core.Responses$Error.asException(Responses.java:92) 
    at com.datastax.driver.core.ResultSetFuture$ResponseCallback.onSet(ResultSetFuture.java:122) 
    at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:224) 
    at com.datastax.driver.core.RequestHandler.onSet(RequestHandler.java:373) 
    at com.datastax.driver.core.Connection$Dispatcher.messageReceived(Connection.java:510) 
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) 
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) 
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) 
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) 
    at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:70) 
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) 
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) 
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) 
    at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462) 
    at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443) 
    at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) 
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) 
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) 
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) 
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) 
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) 
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) 
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109) 
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312) 
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90) 
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) 
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) 
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
    at java.lang.Thread.run(Thread.java:744) 
Caused by: com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write) 
    at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:53) 
    at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:33) 
    at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:165) 
    at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:66) 
    ... 21 more 

One of the nodes reported this at roughly the same time:

ERROR [Native-Transport-Requests:12539] 2014-02-16 23:37:22,191 ErrorMessage.java (line 222) Unexpected exception during request 
java.io.IOException: Connection reset by peer 
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method) 
    at sun.nio.ch.SocketDispatcher.read(Unknown Source) 
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source) 
    at sun.nio.ch.IOUtil.read(Unknown Source) 
    at sun.nio.ch.SocketChannelImpl.read(Unknown Source) 
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64) 
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109) 
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312) 
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90) 
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) 
    at java.lang.Thread.run(Unknown Source) 

Answers

Score: 23

Although I have not understood the root cause of this issue, I was able to solve the problem by raising the timeout value in the conf/cassandra.yaml file:

write_request_timeout_in_ms: 20000 
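Note that the Java driver also has its own client-side per-request read timeout (12 seconds by default in the 2.0 driver), so after raising the server-side limit past that, you may also need to raise the client-side one, along these lines:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.SocketOptions;

    // Keep the client-side per-request timeout above the server's
    // write_request_timeout_in_ms, or the client will now give up first.
    Cluster cluster = Cluster.builder()
            .addContactPoint("127.0.0.1")
            .withSocketOptions(new SocketOptions().setReadTimeoutMillis(22000))
            .build();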

I ran into the same problem. I was writing data into Cassandra using a 'BatchStatement'. My batch size was 10000. After shrinking the batch, I stopped hitting the exception. So perhaps you are trying to load too much data into Cassandra in a single request.
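(For what it is worth, "shrinking the batch" can be as simple as flushing in fixed-size chunks. A sketch with the 2.0 driver's BatchStatement, where the chunk size of 100 is only an illustrative starting point, and counter updates would need BatchStatement.Type.COUNTER:)

    import com.datastax.driver.core.BatchStatement;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.Statement;
    import java.util.List;

    // Illustrative: execute many statements in chunks instead of one huge batch.
    static void writeChunked(Session session, List<Statement> statements, int chunkSize) {
        BatchStatement batch = new BatchStatement();
        for (Statement s : statements) {
            batch.add(s);
            if (batch.size() >= chunkSize) {
                session.execute(batch);
                batch = new BatchStatement(); // start a fresh chunk
            }
        }
        if (batch.size() > 0) {
            session.execute(batch); // flush the remainder
        }
    }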


That is actually quite a poor choice. Would you happen to know why this occurs, since I am facing the same error right now?


@Superbrain_bug Thanks for sharing your judgement of this solution. I am sure some people will find your assessment interesting. If you find an alternative solution to this problem, I am sure everyone would like to hear about it. – Jacob

Score: 0

It is the coordinator (so the server) timing out while waiting for acknowledgements of the write.
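The WriteTimeoutException the driver throws carries the details of that coordinator-side timeout, which helps when debugging. A minimal sketch of logging them (the wrapper method is illustrative; the accessors are the driver's own):

    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.Statement;
    import com.datastax.driver.core.exceptions.WriteTimeoutException;

    // Illustrative wrapper: run a statement and log coordinator timeout details.
    static void executeLogged(Session session, Statement statement) {
        try {
            session.execute(statement);
        } catch (WriteTimeoutException e) {
            // The coordinator accepted the write but did not get the required
            // replica acknowledgements within write_request_timeout_in_ms.
            System.err.printf("write timeout at %s: %d/%d acks, write type %s%n",
                    e.getConsistencyLevel(),
                    e.getReceivedAcknowledgements(),
                    e.getRequiredAcknowledgements(),
                    e.getWriteType());
            throw e;
        }
    }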


Hi Chris, how can I debug this further and find out why the ACKs are not coming? I am facing a similar issue and trying to find the root cause... thanks. – opstalj

Score: 13

We hit a similar issue on a single node in an ESX cluster with SAN storage attached (which is not recommended by DataStax, but we have no other option at the moment).

Note: the settings below can be a big hit to the maximum performance Cassandra can achieve, but we chose a stable system over high performance.

While running iostat -xmt 1, we found high w_await times at the same moments the WriteTimeoutExceptions occurred. It turned out the memtable could not be written to disk within the default write_request_timeout_in_ms: 2000 setting.

We significantly reduced the memtable size from 512 MB (the default is 25% of the heap space, which was 2 GB in our case) to 32 MB:

# Total permitted memory to use for memtables. Cassandra will stop 
# accepting writes when the limit is exceeded until a flush completes, 
# and will trigger a flush based on memtable_cleanup_threshold 
# If omitted, Cassandra will set both to 1/4 the size of the heap. 
# memtable_heap_space_in_mb: 2048 
memtable_heap_space_in_mb: 32 

We also slightly increased the write timeout to 3 seconds:

write_request_timeout_in_ms: 3000 

Also, make sure writes are synced to disk regularly if you have high IO wait times:

#commitlog_sync: batch 
#commitlog_sync_batch_window_in_ms: 2 
# 
# the other option is "periodic" where writes may be acked immediately 
# and the CommitLog is simply synced every commitlog_sync_period_in_ms 
# milliseconds. 
commitlog_sync: periodic 
commitlog_sync_period_in_ms: 10000 

These settings allow the memtable to stay small and be written out often. The exceptions were resolved, and we survived the stress tests we ran against the system.

Score: -1

It is worth double-checking your Cassandra GC settings.

In my case, I was using a semaphore to throttle asynchronous writes and still (sometimes) getting timeouts.
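(The throttling pattern in question looks roughly like this sketch, using the driver's async API and Guava's callback helper; the permit count of 256 is illustrative:)

    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Session;
    import com.google.common.util.concurrent.FutureCallback;
    import com.google.common.util.concurrent.Futures;
    import java.util.concurrent.Semaphore;

    public class ThrottledWriter {
        // At most 256 writes in flight at any time (illustrative limit).
        private final Semaphore inFlight = new Semaphore(256);
        private final Session session;

        public ThrottledWriter(Session session) {
            this.session = session;
        }

        public void write(String query, Object... values) throws InterruptedException {
            inFlight.acquire(); // blocks while too many writes are pending
            ResultSetFuture future = session.executeAsync(query, values);
            Futures.addCallback(future, new FutureCallback<ResultSet>() {
                public void onSuccess(ResultSet rs) { inFlight.release(); }
                public void onFailure(Throwable t) { inFlight.release(); } // plus log/retry
            });
        }
    }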

It turned out I was running with unsuitable GC settings: I had been using cassandra-unit for convenience, which had the unintended consequence of running with the default VM settings. As a result, we would end up triggering a stop-the-world GC, causing the write timeouts. After applying the same GC settings as the Cassandra Docker image I run, everything was fine.

This is probably an uncommon cause, but it helped me, so it seemed worth recording here.
