2013-04-09 157 views

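How to handle the AllServersUnavailable exception

I want to run a simple write test against a Cassandra instance (v1.1.10) on a single node. I just want to see how it handles constant writes and whether it can keep up with the write rate.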

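import random
import string
import sys
import uuid

from pycassa.pool import ConnectionPool
from pycassa.columnfamily import ColumnFamily

pool = ConnectionPool('testdb')
test_cf = ColumnFamily(pool, 'test')
test2_cf = ColumnFamily(pool, 'test2')
test3_cf = ColumnFamily(pool, 'test3')
# Each batch buffers inserts and sends them once queue_size is reached.
test_batch = test_cf.batch(queue_size=1000)
test2_batch = test2_cf.batch(queue_size=1000)
test3_batch = test3_cf.batch(queue_size=1000)

chars = string.ascii_uppercase
counter = 0
while True:  # runs until an exception is raised
    counter += 1
    uid = uuid.uuid1()
    junk = ''.join(random.choice(chars) for x in range(50))
    test_batch.insert(uid, {'junk': junk})
    test2_batch.insert(uid, {'junk': junk})
    test3_batch.insert(uid, {'junk': junk})
    sys.stdout.write(str(counter) + '\n')

pool.dispose()  # never reached: the loop above has no exit condition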

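After the code has been writing for a long time (once the counter is at roughly 10M or more), it crashes with the following message: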

pycassa.pool.AllServersUnavailable: An attempt was made to connect to each of the servers twice, but none of the attempts succeeded. The last failure was timeout: timed out

I set queue_size=100 and that did not help. After the script crashed I also started the cqlsh -3 console to truncate the tables, and got the following error:

Unable to complete request: one or more nodes were unavailable.

Tailing /var/log/cassandra/system.log shows no sign of an error, only INFO messages about compaction, FlushWriter and so on. What am I doing wrong?
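
One plausible reading of why lowering queue_size made no difference: a size-limited batch changes how many rows travel in each request, not how many rows the node has to absorb overall. The sketch below models that with a made-up stand-in class; `CountingBatch` and `simulate` are hypothetical names for illustration and are not pycassa code.

```python
class CountingBatch(object):
    """Hypothetical stand-in for a size-limited write batch (not pycassa)."""

    def __init__(self, queue_size):
        self.queue_size = queue_size
        self.buffer = []
        self.flushes = 0    # number of requests "sent" to the server
        self.rows_sent = 0  # total rows the server would have to absorb

    def insert(self, key, columns):
        # Buffer the row and flush once the buffer reaches queue_size.
        self.buffer.append((key, columns))
        if len(self.buffer) >= self.queue_size:
            self.send()

    def send(self):
        # "Send" the buffered rows as one request and empty the buffer.
        if self.buffer:
            self.flushes += 1
            self.rows_sent += len(self.buffer)
            self.buffer = []


def simulate(total_rows, queue_size):
    """Insert total_rows rows and return (requests sent, rows sent)."""
    batch = CountingBatch(queue_size)
    for i in range(total_rows):
        batch.insert(i, {'junk': 'x'})
    batch.send()  # flush whatever is left in the buffer
    return batch.flushes, batch.rows_sent
```

Under this model, `simulate(10000, 1000)` and `simulate(10000, 100)` send the same total number of rows and differ only in the number of requests, so a smaller queue alone would not be expected to relieve a node that cannot keep up with the sustained write volume.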


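Do you see excessive CPU or disk usage on that node? It may be that JVM garbage collection is not coping well, although I would expect the logs to show something about that. – 2013-04-15 23:09:14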

Answer


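I have had this problem too. As @tyler-hobbs suggested in his comment, the node is probably overloaded (it was for me). A simple workaround I have used is to back off and let the node catch up. I have rewritten the loop above to catch the error, sleep for a while, and try again. I have run this against a single-node cluster and it works: it pauses (for a minute) and backs off periodically (never more than 5 times in a row). No data is lost with this script unless the error is thrown five times in a row (in which case you probably want to fail hard rather than return to the loop).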

# Needed in addition to the setup in the question.
import time
import pycassa.pool

while True:
    counter += 1
    uid = uuid.uuid1()
    junk = ''.join(random.choice(chars) for x in range(50))
    tryCount = 5  # 5 is probably unnecessarily high
    while tryCount > 0:
        try:
            test_batch.insert(uid, {'junk': junk})
            test2_batch.insert(uid, {'junk': junk})
            test3_batch.insert(uid, {'junk': junk})
            tryCount = -1  # success: leave the retry loop
        except pycassa.pool.AllServersUnavailable as e:
            print("Trying to insert [" + str(uid) + "] but got error " + str(e) + " (attempt " + str(tryCount) + "). Backing off for a minute to let Cassandra settle down")
            time.sleep(60)  # A delay of 60s is probably unnecessarily high
            tryCount = tryCount - 1
    sys.stdout.write(str(counter) + '\n')

I have added a complete gist here.
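
The answer notes that after five consecutive failures you probably want to fail hard rather than carry on. A minimal sketch of that variant follows, with two changes that are assumptions rather than part of the answer: the delay doubles after each failure instead of staying at a fixed 60 seconds, and a final failure raises an exception. The names `retry_with_backoff` and `GiveUp` are hypothetical helpers, not part of pycassa.

```python
import time


class GiveUp(Exception):
    """Raised when every retry attempt has failed (hypothetical name)."""


def retry_with_backoff(operation, retriable, max_attempts=5,
                       base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is exhausted.

    retriable is the exception class (or tuple of classes) worth retrying.
    The delay doubles after each failure, capped at max_delay seconds.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retriable as exc:
            if attempt == max_attempts:
                # Fail hard instead of silently moving on to the next row.
                raise GiveUp("gave up after %d attempts: %s" % (attempt, exc))
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

With the loop above, a call might look like `retry_with_backoff(lambda: test_batch.insert(uid, {'junk': junk}), pycassa.pool.AllServersUnavailable)`. The `sleep` parameter exists only so the helper can be exercised without real waiting.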