2012-09-16 64 views
1

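Cassandra insert/batch_mutate via Thrift: latency increases in a loop

I have a Java application that reads data from a partitioned Oracle table (only a couple of columns, roughly 100 GB in size) and loads it into a Cassandra cluster, using a number of threads equal to the number of partitions. A monitoring thread displays the progress of each thread (rows inserted @ ?? ms/rec), as in the sample output below.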

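The problem is that whichever API method I use (insert/batch_mutate), the latency increases steadily. As you can see, it starts below 10 ms/record and keeps climbing at a steady rate. Any guesses as to what the cause might be?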

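PS: For certain reasons I chose the raw client, and I cannot switch to a higher-level client unless that is the only available solution. In any case, I am curious about this strange behavior.

Sample output from the monitoring thread: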

[email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, 0, 0, . 
[email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, 0, 0, . 
[email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, 0, 0, . 
[email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, 0, 0, . 
[email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, 0, 0, . 
[email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, [email protected] ms/rec, 0, 0, . 

Answers

0

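Why don't you build up all the inserts inside the loop and then send them with "batch_mutate" afterwards? My guess is that performance would be better, and you don't need Hector to use "batch_mutate" (it is also available in Thrift, the lower-level implementation). Perhaps the slowdown is due to Hector's implementation.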
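The idea of accumulating writes in the loop and sending them in groups can be sketched roughly as follows. This is a standard-library-only illustration, not Thrift or Hector code: `ColumnWrite`, `BatchSink`, and `Batcher` are hypothetical stand-ins, and the batch size of 10 and the column family name are arbitrary. In the Thrift-generated Java client, `batch_mutate` takes a map of row key to column family to list of mutations, plus a consistency level; the nested map below mirrors that shape, and the sink marks the place where the real call would go. Whether batching changes the latency trend described in the question would still have to be measured.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BatchMutateSketch {

    /** Hypothetical stand-in for a Thrift Mutation: a single column name/value pair. */
    static final class ColumnWrite {
        final String name;
        final String value;

        ColumnWrite(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Hypothetical sink; a real one would wrap the client's batch_mutate call. */
    interface BatchSink {
        void send(Map<ByteBuffer, Map<String, List<ColumnWrite>>> mutationMap);
    }

    /** Collects writes as row key -> column family -> writes, flushing every batchSize distinct rows. */
    static final class Batcher {
        private final BatchSink sink;
        private final String columnFamily;
        private final int batchSize;
        private Map<ByteBuffer, Map<String, List<ColumnWrite>>> pending = new LinkedHashMap<>();

        Batcher(BatchSink sink, String columnFamily, int batchSize) {
            this.sink = sink;
            this.columnFamily = columnFamily;
            this.batchSize = batchSize;
        }

        void add(String rowKey, ColumnWrite write) {
            ByteBuffer key = ByteBuffer.wrap(rowKey.getBytes(StandardCharsets.UTF_8));
            pending.computeIfAbsent(key, k -> new LinkedHashMap<>())
                   .computeIfAbsent(columnFamily, cf -> new ArrayList<>())
                   .add(write);
            if (pending.size() >= batchSize) {
                flush();
            }
        }

        /** Sends whatever is pending; call once more after the loop for the final partial batch. */
        void flush() {
            if (pending.isEmpty()) {
                return;
            }
            sink.send(pending);
            // Start a fresh map so the batch already handed to the sink is not modified.
            pending = new LinkedHashMap<>();
        }
    }

    public static void main(String[] args) {
        List<Integer> batchSizes = new ArrayList<>();
        // The sink here only records how many rows each batch contained.
        Batcher batcher = new Batcher(batch -> batchSizes.add(batch.size()), "ExampleCF", 10);
        for (int i = 0; i < 25; i++) {
            batcher.add("row-" + i, new ColumnWrite("col", "value-" + i));
        }
        batcher.flush();
        System.out.println(batchSizes); // prints [10, 10, 5]
    }
}
```

Swapping the map for a new one after each flush keeps memory use bounded by the batch size, which is worth checking in any long-running loader that holds on to per-record state.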

+0

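Hello, thanks for the reply. I didn't know batch_mutate could be used to mutate a whole series of records at once. I'll look at the documentation today. If you have a link, please post it here. –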

+0

It may be too much to ask, but a sample code snippet would be much appreciated. :) –

+0

Hi, this worked for me (don't worry about the Japanese comments, it is easy to follow): http://d.hatena.ne.jp/akishin999/20100507/1273254065 – arutaku