2017-04-14

Cassandra - bootstrapped new node - not compacting

I have a 21-node cluster (C* 2.2) of m4.2xlarges, each with five 1TB SSD volumes.

When it was 50% full (500GB * 5 = 2.5TB per node), I realized I needed more space, so I added a new node.

The new node joined the cluster (it went from UJ to UN), but its disk usage is 4.2TB.
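To put the 4.2TB figure in context, here is a rough back-of-the-envelope sketch in Python. It assumes tokens are spread evenly across nodes and that the replication factor is unchanged; neither is stated in the post, so treat the result as an estimate only.

```python
# Rough estimate of how much data the new node should hold.
# Assumption (not from the post): even token distribution, same replication factor.
OLD_NODES = 21
PER_NODE_TB = 2.5            # 500 GB * 5 volumes, as stated in the question
OBSERVED_NEW_NODE_TB = 4.2   # disk usage reported on the new node

total_tb = OLD_NODES * PER_NODE_TB           # total on-disk data before the new node
expected_tb = total_tb / (OLD_NODES + 1)     # even share once there are 22 nodes
excess_tb = OBSERVED_NEW_NODE_TB - expected_tb

print(f"cluster total:      {total_tb:.1f} TB")
print(f"expected per node:  {expected_tb:.2f} TB")
print(f"excess on new node: {excess_tb:.2f} TB")
```

Under those assumptions the new node holds noticeably more than an even share, which is the gap that compaction would be expected to close.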

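I assumed this was because compaction was behind, so I waited a few days. Even though compactions were running, disk usage did not change. The new box was really CPU bound, so I moved it to a compute-optimized c4.8xlarge box, set concurrent_compactions to 20, and disabled compaction_throughput throttling to get this done.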

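In the meantime, I stopped all writes to the cluster. The number of pending compactions keeps increasing, and the data on disk is not shrinking.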

What am I doing wrong? System time looks very high. I am using org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy, and the current compaction thresholds are min = 4, max = 32.

When I run strace -f -c -p cassandra-pid > strace_count:

% time  seconds usecs/call  calls errors syscall 
------ ----------- ----------- --------- --------- ---------------- 
49.57 7431.363672  140392  52933  17755 futex 
30.22 4530.012667  482481  9389   epoll_wait 
11.33 1697.685882  2143543  792   recvfrom 
    3.68 551.306817  1596 345500   7 write 
    3.58 537.257283 14138350  38  33 restart_syscall 
    0.78 117.381206  111262  1055   poll 
    0.28 41.738677   636  65675   lseek 
    0.14 21.138626  1659  12741   pread 
    0.10 15.189009  1838  8265   read 
    0.07 9.898101   696  14229   sched_yield 
    0.06 8.984107  23831  377   sendto 
    0.04 6.148230  9759  630   munmap 
    0.04 5.760339  21902  263   mprotect 
    0.02 3.154839   992  3181  359 fadvise64 
    0.02 3.107529   652  4769  215 stat 
    0.01 2.006363  167197  12   msync 
    0.01 1.956998  7040  278   mmap 
    0.01 1.838682  1155  1592   8 unlink 
    0.01 1.080512   602  1794   lstat 
    0.01 0.861741   578  1490   close 
    0.00 0.626903   562  1116   open 
    0.00 0.596450   588  1014   fcntl 
    0.00 0.440250   644  684   fstat 
    0.00 0.318874   630  506   epoll_ctl 
    0.00 0.249772  4625  54   fdatasync 
    0.00 0.149440  1660  90   fsync 
    0.00 0.093154   647  144   rename 
    0.00 0.069017   575  120   statfs 
    0.00 0.018136   356  51   getpriority 
    0.00 0.014358   598  24   rt_sigprocmask 
    0.00 0.011584   161  72   times 
    0.00 0.009858   616  16   setsockopt 
    0.00 0.009396   940  10   link 
    0.00 0.008072   24  336   7 rt_sigreturn 
    0.00 0.004960  1240   4   getsockopt 
    0.00 0.004926   411  12   sched_getaffinity 
    0.00 0.004503   500   9   dup2 
    0.00 0.002998   500   6   madvise 
    0.00 0.002693   449   6   set_robust_list 
    0.00 0.002597   433   6   accept 
    0.00 0.002000   333   6   clone 
    0.00 0.002000   500   4   2 accept4 
    0.00 0.001243   207   6   gettid 
    0.00 0.001000   500   2   writev 
    0.00 0.001000   500   2   recvmsg 
    0.00 0.001000   143   7   getsockname 
    0.00 0.001000   500   2   getpeername 
    0.00 0.001000   167   6   6 setpriority 
    0.00 0.000000   0   1   socket 
    0.00 0.000000   0   1   bind 
------ ----------- ----------- --------- --------- ---------------- 
100.00 14990.519464    529320  18392 total 
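A summary like the one above is easier to compare between runs once it is parsed. The following is a minimal Python sketch, assuming the column layout shown above (with the errors column sometimes empty); the function name `parse_strace_summary` is made up for illustration.

```python
def parse_strace_summary(text):
    """Parse `strace -c` summary rows; header, rule, and total lines are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 5 fields (errors column empty) or 6 fields (errors present).
        if len(parts) not in (5, 6) or parts[-1] == "total":
            continue
        try:
            row = {
                "pct_time": float(parts[0]),
                "seconds": float(parts[1]),
                "usecs_per_call": int(parts[2]),
                "calls": int(parts[3]),
                "errors": int(parts[4]) if len(parts) == 6 else 0,
                "syscall": parts[-1],
            }
        except ValueError:
            continue  # e.g. the dashed separator line
        rows.append(row)
    return rows


# A few rows copied from the output above.
SAMPLE = """\
% time  seconds usecs/call  calls errors syscall
------ ----------- ----------- --------- --------- ----------------
49.57 7431.363672  140392  52933  17755 futex
30.22 4530.012667  482481  9389   epoll_wait
11.33 1697.685882  2143543  792   recvfrom
    3.68 551.306817  1596 345500   7 write
------ ----------- ----------- --------- --------- ----------------
100.00 14990.519464    529320  18392 total
"""

rows = parse_strace_summary(SAMPLE)
for row in sorted(rows, key=lambda r: r["seconds"], reverse=True):
    print(f'{row["syscall"]:<12} {row["seconds"]:>12.2f}s  {row["calls"]:>8} calls')
```

Sorting by seconds makes it obvious which calls dominate the trace; in the sample, futex and epoll_wait come first.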

When I run top (with 1 for the per-CPU view):

Tasks: 1506 total, 8 running, 1496 sleeping, 0 stopped, 2 zombie 
Cpu0 : 0.3%us, 47.3%sy, 10.5%ni, 41.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu1 : 0.7%us, 87.6%sy, 11.7%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu2 : 3.2%us, 65.0%sy, 0.0%ni, 31.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu3 : 11.6%us, 39.9%sy, 0.0%ni, 48.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu4 : 1.0%us, 55.3%sy, 9.2%ni, 34.6%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu5 : 0.3%us, 98.0%sy, 1.7%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu6 : 0.4%us, 90.7%sy, 1.4%ni, 6.8%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu7 : 3.4%us, 20.2%sy, 9.4%ni, 64.0%id, 3.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu8 : 1.7%us, 24.9%sy, 0.3%ni, 73.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu9 : 0.7%us, 79.4%sy, 0.7%ni, 18.9%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu10 : 0.7%us, 64.9%sy, 13.6%ni, 14.0%id, 6.8%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu11 : 1.0%us, 50.7%sy, 0.0%ni, 18.6%id, 29.7%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu12 : 0.3%us, 58.9%sy, 0.0%ni, 40.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu13 : 0.3%us, 72.5%sy, 26.8%ni, 0.0%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu14 : 0.0%us, 50.2%sy, 49.8%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu15 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu16 : 0.3%us, 54.2%sy, 0.0%ni, 40.5%id, 5.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu17 : 0.7%us, 46.3%sy, 19.9%ni, 24.0%id, 9.1%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu18 : 0.7%us, 68.9%sy, 0.0%ni, 30.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu19 : 5.7%us, 3.4%sy, 0.0%ni, 90.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu20 : 0.7%us, 44.4%sy, 0.0%ni, 54.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu21 : 1.3%us, 67.8%sy, 0.0%ni, 30.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu22 : 0.7%us, 45.5%sy, 7.3%ni, 42.9%id, 3.6%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu23 : 1.3%us, 22.7%sy, 0.0%ni, 75.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu24 : 0.0%us, 65.4%sy, 0.0%ni, 34.6%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu25 : 0.0%us, 62.0%sy, 12.2%ni, 25.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu26 : 1.3%us, 68.9%sy, 12.6%ni, 17.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu27 : 0.0%us, 64.3%sy, 12.9%ni, 22.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu28 : 0.0%us, 75.8%sy, 0.0%ni, 23.5%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu29 : 0.0%us, 60.3%sy, 1.7%ni, 37.4%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu30 : 0.3%us, 48.3%sy, 12.7%ni, 38.0%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu31 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu32 : 0.0%us, 72.1%sy, 25.2%ni, 0.0%id, 2.7%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu33 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu34 : 0.3%us, 66.7%sy, 0.0%ni, 33.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Cpu35 : 0.0%us, 67.7%sy, 0.0%ni, 32.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st 
Mem: 61820728k total, 61610932k used, 209796k free,  456k buffers 
Swap:  0k total,  0k used,  0k free, 35425968k cached 
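To quantify the "system time looks very high" observation, the per-CPU lines can be summarized. This is a small Python sketch, assuming the older top format shown above (`CpuN : x%us, y%sy, ...`); the 90% cutoff for "saturated" is an arbitrary choice for illustration.

```python
import re

CPU_LINE = re.compile(r"^Cpu(\d+)\s*:\s*(.*)$")
FIELD = re.compile(r"([\d.]+)%(\w+)")


def parse_cpu_lines(text):
    """Map CPU index -> {"us": ..., "sy": ..., ...} for each per-CPU line of top."""
    cpus = {}
    for line in text.splitlines():
        match = CPU_LINE.match(line.strip())
        if not match:
            continue
        fields = {name: float(value) for value, name in FIELD.findall(match.group(2))}
        cpus[int(match.group(1))] = fields
    return cpus


# Three lines copied from the output above.
SAMPLE = """\
Cpu0 : 0.3%us, 47.3%sy, 10.5%ni, 41.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu15 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu19 : 5.7%us, 3.4%sy, 0.0%ni, 90.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
"""

cpus = parse_cpu_lines(SAMPLE)
mean_sy = sum(c["sy"] for c in cpus.values()) / len(cpus)
# Arbitrary illustrative threshold: cores spending >= 90% of their time in the kernel.
saturated = sorted(i for i, c in cpus.items() if c["sy"] >= 90.0)

print(f"mean system time: {mean_sy:.1f}%")
print(f"cores at >= 90% sy: {saturated}")
```

Feeding in all 36 lines instead of the three-line sample gives the average system time across the whole box.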

nodetool compactionstats

 pending tasks: 281 
    id compaction type  keyspace    table  completed   total unit progress 
    id  Compaction keyspace_1   table_____4 1591902797 2851758523 bytes  55.82% 
    id  Compaction keyspace_1   table_____1  193582898  567222689 bytes  34.13% 
    id  Compaction keyspace_1   table_____2  187022078 2264168754 bytes  8.26% 
    id  Compaction keyspace_1   table_____1 22841754587 24781014960 bytes  92.17% 
    id  Compaction keyspace_1   table_____5  764633368 3904191508 bytes  19.58% 
    id  Compaction keyspace_1   table_____1 1856076066 2326634436 bytes  79.78% 
    id  Compaction keyspace_1   table_____7  254856804  499133271 bytes  51.06% 
    id  Compaction keyspace_1   table_____8 1406859449 1803885628 bytes  77.99% 
    id  Compaction keyspace_1   table_____7 1734201253 2308801656 bytes  75.11% 
    id  Compaction keyspace_1   table_____1  656195289  931867447 bytes  70.42% 
    id  Compaction keyspace_1   table_____1  657036608 1380870812 bytes  47.58% 
    id  Compaction keyspace_1   table_____1  235054945 18957522878 bytes  1.24% 
    id  Compaction keyspace_1   table____10  2351049  3552009 bytes  66.19% 
    id  Compaction keyspace_1   table_____2  810635522  867307196 bytes  93.47% 
    id  Compaction keyspace_1   table_____5  281573682  780375396 bytes  36.08% 
    id  Compaction keyspace_1   table_____6 2350396501 2398745060 bytes  97.98% 
    id  Compaction keyspace_1   table_____1  63122362  434443651 bytes  14.53% 
    id  Compaction keyspace_1   table_____3  287859748  399896319 bytes  71.98% 
    id  Compaction keyspace_1   table_____2 1776310557 2685522257 bytes  66.14% 
    id  Compaction keyspace_1   table_____1  494183426 22432529013 bytes  2.20% 
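The remaining work across the active compactions can be totalled from this output. Below is a minimal Python sketch, assuming each task row ends with `completed total bytes progress%` as shown above; the function name is hypothetical, and the pending tasks are not included because the output does not give their sizes.

```python
def parse_active_compactions(text):
    """Extract table, completed, and total bytes from `nodetool compactionstats` rows."""
    tasks = []
    for line in text.splitlines():
        parts = line.split()
        # Task rows end with: <completed> <total> bytes <progress%>
        if len(parts) < 6 or parts[-2] != "bytes":
            continue
        tasks.append({
            "table": parts[-5],
            "completed": int(parts[-4]),
            "total": int(parts[-3]),
        })
    return tasks


# Three rows copied from the output above.
SAMPLE = """\
 pending tasks: 281
    id compaction type  keyspace    table  completed   total unit progress
    id  Compaction keyspace_1   table_____4 1591902797 2851758523 bytes  55.82%
    id  Compaction keyspace_1   table_____1 22841754587 24781014960 bytes  92.17%
    id  Compaction keyspace_1   table_____1  235054945 18957522878 bytes  1.24%
"""

tasks = parse_active_compactions(SAMPLE)
remaining = sum(t["total"] - t["completed"] for t in tasks)
print(f"{len(tasks)} active tasks, {remaining / 1e9:.1f} GB still to process")
```

Comparing this number between two snapshots shows whether the running compactions are actually making progress.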

nodetool compactionhistory: there are a lot of lines here, but here is a sample:

id datatype index  1492056758751    558756   540336   {1:175, 2:6} 
id datatype index  1492075503279    128269   114446   {1:1160, 2:31} 
id datatype index  1492072165446    22914902  22464994  {1:626, 2:37} 
id datatype index 1492060375419    73514456  72842367  {1:398795, 2:7294, 3:300} 
id datatype index 1492075160893    85707   64387   {1:236, 2:41} 
id datatype index  1492151303774    139172156  134666782  {1:9129, 2:3313, 3:935, 4:112} 
id datatype index 1492135037619    30839157  29690968  {1:32854, 2:5702, 3:535, 4:61} 
id datatype index 1492075521048    255030   253531   {1:220, 2:6} 
id datatype index  1492116936213    11391100  10943344  {1:6798, 2:301} 
id datatype index 1492075649703    1527580  1486442  {1:5381, 2:330} 
id datatype index   1492153054713    218401839  216306589  {1:6669, 2:1068, 3:273, 4:22} 
id datatype index 1492169550324    9172160  8724129  {1:42943, 2:2390} 
id datatype index 1492087845445    8086487  7810261  {1:8445, 2:1209, 3:95} 
id datatype index 1492116806390    837169   806946   {1:5984, 2:262} 
id datatype index 1492167939189    275277987  271618327  {1:38585, 2:18745, 3:494} 
id datatype index    277471932  266321389  {1:47184, 2:16047, 3:367, 4:468} 
id datatype index  1492116559239    1569590  1402724  {1:460, 2:62} 
id datatype index 1492173763782    83298080  81977056  {1:36383, 2:7577, 3:3565, 4:95, 6:169} 
id datatype index  1492158247355    42660621  40224352  {1:6565, 2:987, 3:316, 4:521, 6:17, 8:70} 
id datatype index  1492179061558    589874248  568901949  {1:16726, 2:9342, 3:1149, 4:141} 
id datatype index  1492190014331    807975203  786973389  {1:67311, 2:1852} 
id datatype index  1491949569125    45499223  46212100  {1:3944, 2:523, 3:1268, 4:262} 
id datatype index 1492063798113    2401   1134   {1:1, 2:3} 
id datatype index 1492100603829    7693737  7507021  {1:7112, 2:870, 3:235, 4:27} 
id datatype index 1492202653921    114122963  111721885  {1:2038, 2:2997, 3:1095, 4:48, 5:40} 
id datatype index  1492063653695    60700   50728   {1:157, 2:12} 
id datatype index 1492152115922    165656033  159591156  {1:5180, 2:3233, 3:600, 4:564, 5:37, 6:14, 7:12} 
id datatype index 1492160511587    3353867375  3280857307  {1:12265239, 2:409303, 3:16391, 4:1932} 
id datatype index  1492116638632    3226315  2863672  {1:956, 2:137} 
id datatype index 1492050334458    64407   56620   {1:447, 2:31} 
id datatype index  1492150640640    587181   424081   {1:1293, 2:218, 3:1} 
id datatype index 1492116731210    429668507  407404356  {1:2208562, 2:131875, 3:338} 
id datatype index 1492134210449    293003702  275992426  {1:7429, 2:1686, 3:165} 
id datatype index 1492171984560    8467649  8318775  {1:13330, 2:892, 3:11} 
id datatype index 1492150632348    424314   368270   {1:356, 2:72, 3:8} 
id datatype index   1492068676918    677842865  653983357  {1:11042, 2:405} 
id datatype index 1492160695008    11985228  11689655  {1:3684, 2:1390, 3:441, 4:87} 
id datatype index    5906438  5731218  {1:7040, 2:445, 3:27} 
id datatype index  1492132529903    234019313  220261439  {1:80014, 2:5316} 
id datatype index    1646302  1634070  {1:575, 2:17, 3:5} 
id datatype index 1492145903652    1544764  1527844  {1:1807, 2:295, 3:65, 4:3, 5:6} 
id datatype index 1492075180569    1034277  986605   {1:6591, 2:235} 
id datatype index 1491928723944    5823014  5811907  {1:6498} 
id datatype index 1492075323943    573147   526857   {1:4395, 2:250} 

What type of compaction are you using? What version of C*? Can you also provide 'nodetool compactionhistory' and your current compaction thresholds? –


I am using SizeTieredCompactionStrategy, and the current compaction thresholds are min = 4, max = 32. – emraldinho1986

Answers

1

Your new node should close the compaction gap eventually...

CPU is not the only thing that bounds compactions. Check the compaction_throughput_mb_per_sec parameter, and see this article: https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsConfigureCompaction.html

Please check your nodetool compactionstats and see whether the number of pending tasks decreases over time. Also, please attach the output of nodetool cfstats here.
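One way to watch the pending-task count over time is a small polling script. This is only a sketch in Python: it assumes `nodetool` is on the PATH of the node and that the output contains a `pending tasks: N` line as in the question; the function names, sample count, and interval are made up for illustration.

```python
import re
import subprocess
import time

PENDING = re.compile(r"pending tasks:\s*(\d+)")


def parse_pending_tasks(output):
    """Return the pending-task count from compactionstats output, or None if absent."""
    match = PENDING.search(output)
    return int(match.group(1)) if match else None


def sample_pending_tasks(samples=5, interval_s=60):
    """Run `nodetool compactionstats` several times and collect the pending counts."""
    counts = []
    for i in range(samples):
        result = subprocess.run(
            ["nodetool", "compactionstats"],
            capture_output=True, text=True, check=True,
        )
        counts.append(parse_pending_tasks(result.stdout))
        if i < samples - 1:
            time.sleep(interval_s)
    return counts


def is_decreasing(counts):
    """True if the last valid sample is lower than the first valid sample."""
    valid = [c for c in counts if c is not None]
    return len(valid) >= 2 and valid[-1] < valid[0]

# Example (run on the node itself):
#   counts = sample_pending_tasks(samples=10, interval_s=300)
#   print(counts, "decreasing" if is_decreasing(counts) else "not decreasing")
```

A longer interval is probably more useful than a short one here, since individual compactions in the question run for a long time.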

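As an alternative, you can try re-adding the new node with auto_bootstrap turned off, then run nodetool rebuild, and finally run a repair. That should be faster in your case.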

Edit:

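After reviewing your compactionstats: try reducing the concurrent_compactors property to a lower value. Compaction will take longer to finish, but the impact on overall cluster performance should be smaller.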


I added compactionhistory above. That makes sense; sometimes it just takes as long as it takes. –

0

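If you look at bytes_in and bytes_out for your completed compactions, there is not much of a gap between them. That is why you do not see your disk space utilization change dramatically even after so many compactions have completed.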
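The point about bytes_in and bytes_out can be checked numerically. Here is a short Python sketch using four (bytes_in, bytes_out) pairs copied from the compactionhistory sample in the question; it assumes the two size columns in that sample are bytes_in followed by bytes_out.

```python
# (bytes_in, bytes_out) pairs copied from the compactionhistory sample above.
# Assumption: the two size columns are bytes_in followed by bytes_out.
HISTORY = [
    (558756, 540336),
    (22914902, 22464994),
    (3353867375, 3280857307),
    (807975203, 786973389),
]

ratios = [bytes_out / bytes_in for bytes_in, bytes_out in HISTORY]
total_in = sum(bytes_in for bytes_in, _ in HISTORY)
total_out = sum(bytes_out for _, bytes_out in HISTORY)
saved_fraction = 1 - total_out / total_in

for (bytes_in, bytes_out), ratio in zip(HISTORY, ratios):
    print(f"{bytes_in:>12} -> {bytes_out:>12}  ratio {ratio:.3f}")
print(f"space reclaimed overall: {saved_fraction:.1%}")
```

Each of these compactions writes out nearly as much as it reads, so only a few percent of the input is reclaimed per pass.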

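Note: you should also consider using the leveled compaction strategy if it suits your use case, since it has many advantages over size-tiered. Leveled compaction is usually the best fit for most use cases. Here is a good blog post describing when to use which: http://www.datastax.com/dev/blog/when-to-use-leveled-compaction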