2017-05-12 22 views
0

我們正在使用Titan-1.0.0支持備份爲cassandra 2.1.7。該集羣由36個VM組成,每個16GB RAM(6GB堆積)和16個CPU核心。我們正在使用SSD磁盤來處理cassandra數據,而正常的硬盤使用commitlogs。 RF爲3,讀取/寫入每個CF = 2。 Java版本爲1.8.0_45通過此羣集提供重讀。我們正面臨一些服務器的隨機停機,服務器停機一天的時間間隔約爲兩倍。讓我分享一個特定的實例,大概10-May-2017 04:00 AM,有太多的READ/MUTATION下降,我們不得不重新啓動服務器。以下是我們在服務器故障時收集的一些參數:Titan/Cassandra讀取可伸縮性,隨機服務器關閉或丟失信息

有人可以請幫助我們一些指針,至於這裏出了什麼問題,以及可以如何穩定服務器。

  • gc.log
 

    2017-05-11T21:09:44.018+0530: 1900952.543: Total time for which application threads were stopped: 0.0156063 seconds, Stopping threads took: 0.0125466 seconds 2017-05-11T21:09:45.021+0530: 1900953.546: Total time for which application threads were stopped: 0.0031342 seconds, Stopping threads took: 0.0006697 seconds 2017-05-11T21:09:46.025+0530: 1900954.550: Total time for which application threads were stopped: 0.0037100 seconds, Stopping threads took: 0.0011525 seconds 2017-05-11T21:09:47.031+0530: 1900955.556: Total time for which application threads were stopped: 0.0057972 seconds, Stopping threads took: 0.0030861 seconds 2017-05-11T21:09:48.034+0530: 1900956.559: Total time for which application threads were stopped: 0.0029592 seconds, Stopping threads took: 0.0007129 seconds 2017-05-11T21:09:49.061+0530: 1900957.586: Total time for which application threads were stopped: 0.0268731 seconds, Stopping threads took: 0.0242117 seconds 2017-05-11T21:10:03.095+0530: 1900971.620: Total time for which application threads were stopped: 0.0045643 seconds, Stopping threads took: 0.0009709 seconds 2017-05-11T21:10:04.099+0530: 1900972.624: Total time for which application threads were stopped: 0.0033764 seconds, Stopping threads took: 0.0007673 seconds 2017-05-11T21:10:07.104+0530: 1900975.629: Total time for which application threads were stopped: 0.0056445 seconds, Stopping threads took: 0.0019495 seconds 2017-05-11T21:10:07.447+0530: 1900975.972: Total time for which application threads were stopped: 0.0045721 seconds, Stopping threads took: 0.0009695 seconds {Heap before GC invocations=85258 (full 0): garbage-first heap total 6291456K, used 5059586K [0x0000000640000000, 0x0000000640206000, 0x00000007c0000000) region size 2048K, 1844 young (3776512K), 105 survivors (215040K) Metaspace  used 36253K, capacity 36654K, committed 36784K, reserved 1081344K class space used 3844K, capacity 3965K, committed 4016K, reserved 1048576K 2017-05-11T21:10:07.453+0530: 
    1900975.978: [GC pause (GCLocker Initiated GC) (young) Desired survivor size 242221056 bytes, new threshold 15 (max 15) 
    - age 1: 182659896 bytes, 182659896 total 
    - age 2: 21418160 bytes, 204078056 total 
    - age 3:  871016 bytes, 204949072 total 
    - age 4: 3600512 bytes, 208549584 total 
    - age 5: 1313096 bytes, 209862680 total 
    - age 6:  21152 bytes, 209883832 total , 0.0972867 secs] [Parallel Time: 83.6 ms, GC Workers: 13] 
      [GC Worker Start (ms): Min: 1900975980.9, Avg: 1900975986.0, Max: 1900975989.4, Diff: 8.5] 
      [Ext Root Scanning (ms): Min: 8.5, Avg: 11.9, Max: 17.4, Diff: 8.9, Sum: 154.3] 
      [Update RS (ms): Min: 26.8, Avg: 30.8, Max: 34.2, Diff: 7.4, Sum: 399.8] 
      [Processed Buffers: Min: 31, Avg: 87.3, Max: 144, Diff: 113, Sum: 1135] 
      [Scan RS (ms): Min: 0.1, Avg: 2.8, Max: 6.2, Diff: 6.1, Sum: 36.6] 
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.3, Sum: 0.9] 
      [Object Copy (ms): Min: 31.5, Avg: 32.3, Max: 33.2, Diff: 1.7, Sum: 419.9] 
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] 
      [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.3, Sum: 3.1] 
      [GC Worker Total (ms): Min: 74.7, Avg: 78.1, Max: 83.3, Diff: 8.7, Sum: 1014.7] 
      [GC Worker End (ms): Min: 1900976063.9, Avg: 1900976064.1, Max: 1900976064.3, Diff: 0.3] [Code Root Fixup: 0.1 ms] [Code Root Purge: 0.0 ms] [Clear CT: 1.4 ms] [Other: 12.1 ms] 
      [Choose CSet: 0.0 ms] 
      [Ref Proc: 1.0 ms] 
      [Ref Enq: 0.0 ms] 
      [Redirty Cards: 1.6 ms] 
      [Humongous Reclaim: 0.4 ms] 
      [Free CSet: 4.5 ms] [Eden: 3478.0M(3476.0M)->0.0B(252.0M) Survivors: 210.0M->54.0M Heap: 4941.5M(6144.0M)->1175.0M(6144.0M)] Heap after GC invocations=85259 (full 0): garbage-first heap total 6291456K, used 1203201K [0x0000000640000000, 0x0000000640206000, 0x00000007c0000000) region size 2048K, 27 young (55296K), 27 survivors (55296K) Metaspace  used 36253K, capacity 36654K, committed 36784K, reserved 1081344K class space used 3844K, capacity 3965K, committed 4016K, reserved 1048576K } [Times: user=0.58 sys=0.21, real=0.09 secs] 2017-05-11T21:10:07.551+0530: 
    1900976.076: Total time for which application threads were stopped: 0.1015506 seconds, Stopping threads took: 0.0004470 seconds 2017-05-11T21:10:08.557+0530: 1900977.082: Total time for which application threads were stopped: 0.0053236 seconds, Stopping threads took: 0.0010397 seconds {Heap before GC invocations=85259 (full 0): garbage-first heap total 6291456K, used 1461249K [0x0000000640000000, 0x0000000640206000, 0x00000007c0000000) region size 2048K, 153 young (313344K), 27 survivors (55296K) Metaspace  used 36253K, capacity 36654K, committed 36784K, reserved 1081344K class space used 3844K, capacity 3965K, committed 4016K, reserved 1048576K 2017-05-11T21:10:08.995+0530: 1900977.520: [GC pause (G1 Evacuation Pause) (mixed) Desired survivor size 20971520 bytes, new threshold 3 (max 15) 
    - age 1: 19769440 bytes, 19769440 total 
    - age 2:  801568 bytes, 20571008 total 
    - age 3: 20679216 bytes, 41250224 total 
    - age 4:  738736 bytes, 41988960 total 
    - age 5: 3536096 bytes, 45525056 total 
    - age 6: 1242648 bytes, 46767704 total 
    - age 7:  19208 bytes, 46786912 total , 0.1879873 secs] [Parallel Time: 175.3 ms, GC Workers: 13] 
      [GC Worker Start (ms): Min: 1900977524.7, Avg: 1900977525.0, Max: 1900977525.3, Diff: 0.6] 
      [Ext Root Scanning (ms): Min: 14.5, Avg: 14.8, Max: 15.3, Diff: 0.8, Sum: 192.9] 
      [Update RS (ms): Min: 7.1, Avg: 7.9, Max: 13.3, Diff: 6.1, Sum: 103.2] 
      [Processed Buffers: Min: 4, Avg: 37.0, Max: 93, Diff: 89, Sum: 481] 
      [Scan RS (ms): Min: 55.7, Avg: 60.9, Max: 61.8, Diff: 6.1, Sum: 791.9] 
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 0.7, Diff: 0.7, Sum: 0.9] 
      [Object Copy (ms): Min: 90.3, Avg: 90.7, Max: 91.2, Diff: 0.9, Sum: 1179.5] 
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] 
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.1, Sum: 0.8] 
      [GC Worker Total (ms): Min: 174.3, Avg: 174.6, Max: 174.8, Diff: 0.6, Sum: 2269.2] 
      [GC Worker End (ms): Min: 1900977699.6, Avg: 1900977699.6, Max: 1900977699.7, Diff: 0.1] [Code Root Fixup: 0.2 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.9 ms] [Other: 11.5 ms] 
      [Choose CSet: 3.3 ms] 
      [Ref Proc: 0.4 ms] 
      [Ref Enq: 0.0 ms] 
      [Redirty Cards: 1.1 ms] 
      [Humongous Reclaim: 0.1 ms] 
      [Free CSet: 3.6 ms] [Eden: 252.0M(252.0M)->0.0B(3656.0M) Survivors: 54.0M->30.0M Heap: 1427.0M(6144.0M)->890.0M(6144.0M)] Heap after GC invocations=85260 (full 0): garbage-first heap total 6291456K, used 911310K [0x0000000640000000, 0x0000000640206000, 0x00000007c0000000) region size 2048K, 15 young (30720K), 15 survivors (30720K) Metaspace  used 36253K, capacity 36654K, committed 36784K, reserved 1081344K class space used 3844K, capacity 3965K, committed 4016K, reserved 1048576K } [Times: user=1.59 sys=0.60, real=0.19 secs] 

  • jstat
 

     S0  S1  E  O  M  CCS YGC  YGCT FGC FGCT  GCT 
     0.00 100.00 2.29 37.85 98.56 95.74 79645 6351.963  0 0.000 6351.963 
     0.00 100.00 4.63 37.85 98.56 95.74 79645 6351.963  0 0.000 6351.963 
     0.00 100.00 7.81 37.85 98.56 95.74 79645 6351.963  0 0.000 6351.963 
     0.00 100.00 10.46 37.85 98.56 95.74 79645 6351.963  0 0.000 6351.963 
     0.00 100.00 13.17 37.81 98.56 95.74 79645 6351.963  0 0.000 6351.963 
     0.00 100.00 15.30 38.11 98.56 95.74 79645 6351.963  0 0.000 6351.963 

  • /var/log/cassandra/system.log
 
WARN [SharedPool-Worker-1] 2017-05-11 21:08:49,621 SliceQueryFilter.java:319 - Read 286 live and 12978 tombstone cells in icmsgraph_10042017.edgestore for key: f800000000104600 (see tombstone_warn_threshold). 2147483647 columns were requested, slices=[b02ea000800000350080000002-b02ea000800000350080000003] 
WARN [SharedPool-Worker-1] 2017-05-11 21:08:52,647 SliceQueryFilter.java:319 - Read 286 live and 12978 tombstone cells in icmsgraph_10042017.edgestore for key: f800000000104600 (see tombstone_warn_threshold). 2147483647 columns were requested, slices=[b02ea000800000350080000002-b02ea000800000350080000003] 
WARN [GossipTasks:1] 2017-05-11 21:09:05,030 Gossiper.java:714 - Gossip stage has 1 pending tasks; skipping status check (no nodes will be marked down) 
WARN [GossipTasks:1] 2017-05-11 21:09:16,588 FailureDetector.java:249 - Not marking nodes down due to local pause of 16886451559 > 5000000000 
INFO [ScheduledTasks:1] 2017-05-11 21:09:16,954 MessagingService.java:888 - 1924 READ messages dropped in last 5000ms 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,153 StatusLogger.java:51 - Pool Name     Active Pending  Completed Blocked All Time Blocked 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,154 StatusLogger.java:66 - MutationStage      0   0  32568360   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,155 StatusLogger.java:66 - RequestResponseStage    1   0 12112153129   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,155 StatusLogger.java:66 - ReadRepairStage     0   0  593141839   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,155 StatusLogger.java:66 - CounterMutationStage    0   0    0   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,155 StatusLogger.java:66 - ReadStage      15   0 12168491480   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,155 StatusLogger.java:66 - MiscStage       0   0    0   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,156 StatusLogger.java:66 - HintedHandoff      0   1   3531   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,156 StatusLogger.java:66 - GossipStage      0   0  5798815   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,156 StatusLogger.java:66 - CacheCleanupExecutor    0   0    0   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,156 StatusLogger.java:66 - InternalResponseStage    0   0   537   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,156 StatusLogger.java:66 - CommitLogArchiver     0   0    0   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,157 StatusLogger.java:66 - CompactionExecutor    0   0  29915673   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,157 StatusLogger.java:66 - ValidationExecutor    0   0    0   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,157 StatusLogger.java:66 - MigrationStage     0   0   3990   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,157 StatusLogger.java:66 - AntiEntropyStage     0   0    0   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,158 StatusLogger.java:66 - PendingRangeCalculator   0   0   255   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,159 StatusLogger.java:66 - Sampler       0   0    0   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,159 StatusLogger.java:66 - MemtableFlushWriter    0   0   7705   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,159 StatusLogger.java:66 - MemtablePostFlush     0   0   44884   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,159 StatusLogger.java:66 - MemtableReclaimMemory    0   0   7705   0     0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,160 StatusLogger.java:75 - CompactionManager     0   0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,160 StatusLogger.java:87 - MessagingService    n/a  57/3 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,160 StatusLogger.java:97 - Cache Type      Size     Capacity    KeysToSave 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,160 StatusLogger.java:99 - KeyCache     104857584    104857600      all 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,160 StatusLogger.java:105 - RowCache       0      0      all 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,160 StatusLogger.java:112 - ColumnFamily    Memtable ops,data 

INFO [ScheduledTasks:1] 2017-05-11 21:09:17,162 StatusLogger.java:115 - system.compaction_history     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,162 StatusLogger.java:115 - system.hints        0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,162 StatusLogger.java:115 - system.IndexInfo       0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,162 StatusLogger.java:115 - system.schema_columnfamilies     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,162 StatusLogger.java:115 - system.schema_triggers     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,162 StatusLogger.java:115 - system.size_estimates   762300,93404798 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.paxos        0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.peer_events      0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.range_xfers      0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.compactions_in_progress     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.peers      283,103428 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.schema_keyspaces     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.schema_usertypes     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.local        0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.sstable_activity    315,63532 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.schema_columns      0,0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.batchlog       0,0 

INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.titan_ids     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.graphindex   1843,277464 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.system_properties_lock_     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.system_properties     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.edgestore_lock_    6,1191 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.systemlog   741,242777 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.txlog   741,242777 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.graphindex_lock_   1332,277240 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.edgestore  75171,16624309 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_05052017.system_properties_lock_     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_05052017.system_properties     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,170 StatusLogger.java:115 - icmsgraph_05052017.edgestore_lock_     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,170 StatusLogger.java:115 - icmsgraph_05052017.txlog     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,170 StatusLogger.java:115 - icmsgraph_05052017.edgestore     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,170 StatusLogger.java:115 - icmsgraph_05052017.graphindex     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,170 StatusLogger.java:115 - icmsgraph_05052017.titan_ids     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,170 StatusLogger.java:115 - icmsgraph_05052017.systemlog     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,170 StatusLogger.java:115 - icmsgraph_05052017.graphindex_lock_     0,0 
WARN [SharedPool-Worker-24] 2017-05-11 21:09:17,200 SliceQueryFilter.java:319 - Read 286 live and 12978 tombstone cells in icmsgraph_10042017.edgestore for key: f800000000104600 (see tombstone_warn_threshold). 2147483647 columns were requested, slices=[b02ea000800000350080000002-b02ea000800000350080000003] 
WARN [SharedPool-Worker-18] 2017-05-11 21:09:17,201 SliceQueryFilter.java:319 - Read 286 live and 12978 tombstone cells in icmsgraph_10042017.edgestore for key: f800000000104600 (see tombstone_warn_threshold). 2147483647 columns were requested, slices=[b02ea000800000350080000002-b02ea000800000350080000003] 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,170 MessagingService.java:888 - 1 READ messages dropped in last 5000ms 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,170 StatusLogger.java:51 - Pool Name     Active Pending  Completed Blocked All Time Blocked 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,171 StatusLogger.java:66 - MutationStage      0   0  32568376   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,171 StatusLogger.java:66 - RequestResponseStage    0   0 12112193109   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,171 StatusLogger.java:66 - ReadRepairStage     0   0  593143878   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,171 StatusLogger.java:66 - CounterMutationStage    0   0    0   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,171 StatusLogger.java:66 - ReadStage       0   0 12168508581   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,171 StatusLogger.java:66 - MiscStage       0   0    0   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - HintedHandoff      0   1   3531   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - GossipStage      0   0  5798844   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - CacheCleanupExecutor    0   0    0   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - InternalResponseStage    0   0   537   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - CommitLogArchiver     0   0    0   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - CompactionExecutor    0   0  29915673   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - ValidationExecutor    0   0    0   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - MigrationStage     0   0   3990   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - AntiEntropyStage     0   0    0   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - PendingRangeCalculator   0   0   255   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - Sampler       0   0    0   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,173 StatusLogger.java:66 - MemtableFlushWriter    0   0   7705   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,173 StatusLogger.java:66 - MemtablePostFlush     0   0   44884   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,173 StatusLogger.java:66 - MemtableReclaimMemory    0   0   7705   0     0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,173 StatusLogger.java:75 - CompactionManager     0   0  
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,173 StatusLogger.java:87 - MessagingService    n/a  0/0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,173 StatusLogger.java:97 - Cache Type      Size     Capacity    KeysToSave 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,173 StatusLogger.java:99 - KeyCache     104857584    104857600      all 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,174 StatusLogger.java:105 - RowCache       0      0      all 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,173 StatusLogger.java:99 - KeyCache     104857584    104857600      all 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,174 StatusLogger.java:105 - RowCache       0      0      all 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,174 StatusLogger.java:112 - ColumnFamily    Memtable ops,data 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.titan_ids     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.graphindex   1844,277804 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.system_properties_lock_     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.system_properties     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.edgestore_lock_    6,1191 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.systemlog   741,242777 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.txlog   741,242777 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.graphindex_lock_   1332,277240 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.edgestore  75176,16625645 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.system_properties_lock_     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.system_properties     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.edgestore_lock_     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.txlog     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.edgestore     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.graphindex     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.titan_ids     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.systemlog     0,0 
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.graphindex_lock_     0,0 
WARN [SharedPool-Worker-34] 2017-05-11 21:09:24,177 SliceQueryFilter.java:319 - Read 286 live and 12978 tombstone cells in icmsgraph_10042017.edgestore for key: f800000000104600 (see tombstone_warn_threshold). 2147483647 columns were requested, slices=[b02ea000800000350080000002-b02ea000800000350080000003] 
  • ThreadDump 約600節儉主題:
 

    "Thrift:20118" #124110 daemon prio=5 os_prio=0 tid=0x00007feab9a71460 nid=0xb1c6 runnable [0x00007fea6a578000] 
     java.lang.Thread.State: RUNNABLE 
     at java.net.SocketInputStream.socketRead0(Native Method) 
     at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) 
     at java.net.SocketInputStream.read(SocketInputStream.java:170) 
     at java.net.SocketInputStream.read(SocketInputStream.java:141) 
     at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) 
     at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) 
     at java.io.BufferedInputStream.read(BufferedInputStream.java:345) 
     - locked (a java.io.BufferedInputStream) 
     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) 
     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) 
     at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129) 
     at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101) 
     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) 
     at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) 
     at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) 
     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219) 
     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27) 
     at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:205) 
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 

72周的MessagingService - 來電線程...

 

    "MessagingService-Incoming-/192.168.33.67" #124106 prio=5 os_prio=0 tid=0x00007feab98436b0 nid=0xb0ca runnable [0x00007fea7503a000] java.lang.Thread.State: RUNNABLE  at sun.nio.ch.FileDispatcherImpl.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) at sun.nio.ch.IOUtil.read(IOUtil.java:197) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) 
     - locked (a java.lang.Object) at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:192) 
     - locked (a java.lang.Object) at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) 
     - locked (a sun.nio.ch.SocketAdaptor$SocketInputStream) at net.jpountz.lz4.LZ4BlockInputStream.readFully(LZ4BlockInputStream.java:215) at net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:149)  at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:101) at java.io.DataInputStream.readInt(DataInputStream.java:387) at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:169) at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:88) 

  • 特區-q
 
09:10:01 IST   5  1396  16.02  9.24  6.31   0 

回答

2

我認爲這個問題是Cassandra的墓碑

在卡桑德拉每當一個DELETE,INSERT明確null或TTL到期墓碑創建。墓碑是一種機制,它允許卡桑德拉寫得很快,但它有可支付的運營價格。

首先,墓碑本身就是記錄。它們佔用空間並可以大幅增加您需要的存儲量。

其次,大墓碑的數目就會導致延遲和堆壓

從您的卡珊德拉的系統日誌似乎約13K墓碑是爲一個頂點,纔會創建(頂點密鑰ID:0xf800000000104600)。它超大。您可以看到,當cassandra的system.log上出現此警告時,堆壓力增加,因此gc.log同時打印。

我該怎麼辦?

只有在數據所屬的表上定義的local_delete_time + gc_grace_seconds後,Cassandra纔會在壓縮觸發時完全丟棄這些墓碑。

閱讀更多:
http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html https://opencredo.com/cassandra-tombstones-common-issues/

+0

感謝@Ashraful伊斯蘭教的答案。沒有時間來驗證這一點。讓我看看這些建議,並在這裏更新。 – Pankaj

相關問題