所以,我們使用G1 GC。和18GB堆。年輕一代的規模約爲3.5G。堆最大使用量約爲12G。記憶充滿了短暫的生物。G1 GC處理參考太慢
此外,也許重要的是Couchbase實例正在JVM附近的同一節點上運行。它在硬盤上保留更改時經常採用所有iops,但是有足夠的空閒cpu時間和內存。
啓用JVM選項:
-Xmx18g -Xms18g -XX:MaxPermSize=512M -XX:+UseG1GC -XX:+DisableExplicitGC -XX:+ParallelRefProcEnabled
我傷心的問題是年輕一代的GC暫停。長時間停頓GC日誌通常看起來像:
2013-07-10T15:06:25.963+0400: 9122,066: [GC pause (young)
Desired survivor size 243269632 bytes, new threshold 5 (max 15)
- age 1: 69789280 bytes, 69789280 total
- age 2: 58618240 bytes, 128407520 total
- age 3: 54519720 bytes, 182927240 total
- age 4: 51592728 bytes, 234519968 total
- age 5: 45186520 bytes, 279706488 total
9122,066: [G1Ergonomics (CSet Construction) start choosing CSet, predicted base time: 174,16 ms, remaining time: 25,84 ms, target pause time: 200,00 ms]
9122,066: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 426 regions, survivors: 34 regions, predicted young region time: 164,97 ms]
9122,066: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 426 regions, survivors: 34 regions, old: 0 regions, predicted pause time: 339,13 ms, target pause time: 200,00 ms]
9122,259: [SoftReference, 0 refs, 0,0063780 secs]9124,575: [WeakReference, 4 refs, 0,0031600 secs]9124,578: [FinalReference, 1640 refs, 0,0033730 secs]9124,581: [PhantomReference, 145 refs, 0,0032080 secs]9124,5
85: [JNI Weak Reference, 0,0000810 secs], 2,53669600 secs]
[Parallel Time: 190,5 ms]
[GC Worker Start (ms): 9122066,6 9122066,7 9122066,7 9122066,7 9122066,8 9122066,9 9122066,9 9122066,9 9122066,9 9122067,0 9122067,0 9122067,1 9122067,1 9122067,1 9122067,1 9122067,2 91220
67,2 9122067,3
Avg: 9122067,0, Min: 9122066,6, Max: 9122067,3, Diff: 0,7]
[Ext Root Scanning (ms): 4,7 6,0 4,8 4,5 4,2 4,3 4,2 4,3 4,6 3,4 13,5 5,2 4,2 5,6 4,2 4,1 4,3 4,0
Avg: 5,0, Min: 3,4, Max: 13,5, Diff: 10,1]
[Update RS (ms): 20,9 19,6 21,1 21,3 21,2 21,2 21,3 21,2 21,7 21,5 12,1 20,2 21,1 19,4 21,0 21,1 20,7 21,2
Avg: 20,4, Min: 12,1, Max: 21,7, Diff: 9,6]
[Processed Buffers : 27 23 25 29 31 22 25 34 28 14 36 23 24 22 28 24 25 24
Sum: 464, Avg: 25, Min: 14, Max: 36, Diff: 22]
[Scan RS (ms): 9,0 9,2 8,7 8,8 9,1 9,1 8,9 9,1 8,3 9,2 9,0 9,1 9,2 9,2 9,1 9,0 9,0 9,1
Avg: 9,0, Min: 8,3, Max: 9,2, Diff: 1,0]
[Object Copy (ms): 145,1 145,0 145,2 145,1 145,1 144,9 145,1 144,9 144,9 145,4 144,8 144,8 144,8 145,0 145,0 145,1 145,2 144,9
Avg: 145,0, Min: 144,8, Max: 145,4, Diff: 0,6]
[Termination (ms): 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0
Avg: 0,0, Min: 0,0, Max: 0,0, Diff: 0,0]
[Termination Attempts : 5 8 2 11 5 6 6 5 5 7 4 7 2 9 8 5 7 8
Sum: 110, Avg: 6, Min: 2, Max: 11, Diff: 9]
[GC Worker End (ms): 9122246,4 9122246,6 9122246,7 9122246,6 9122246,7 9122246,7 9122246,5 9122246,7 9122246,5 9122246,5 9122246,6 9122246,7 9122246,8 9122246,4 9122246,6 9122246,5 9122246
,7 9122246,8
Avg: 9122246,6, Min: 9122246,4, Max: 9122246,8, Diff: 0,3]
[GC Worker (ms): 179,8 179,9 180,0 179,8 179,9 179,9 179,6 179,8 179,6 179,5 179,6 179,6 179,7 179,3 179,5 179,4 179,4 179,5
Avg: 179,7, Min: 179,3, Max: 180,0, Diff: 0,7]
[GC Worker Other (ms): 10,7 10,7 10,8 10,8 10,9 10,9 11,0 11,0 11,0 11,1 11,1 11,1 11,2 11,2 11,2 11,2 11,3 11,3
Avg: 11,0, Min: 10,7, Max: 11,3, Diff: 0,6]
[Clear CT: 2,8 ms]
[Other: 2343,4 ms]
[Choose CSet: 0,1 ms]
[Ref Proc: 2327,7 ms]
[Ref Enq: 1,9 ms]
[Free CSet: 8,2 ms]
[Eden: 3408M(3408M)->0B(3400M) Survivors: 272M->280M Heap: 9998M(18432M)->6638M(18432M)]
[Times: user=3,26 sys=0,02, real=2,54 secs]
Total time for which application threads were stopped: 2,5434370 seconds
GC出現問題的唯一階段是'參考處理'。但日誌看起來很奇怪:軟,弱,final和jni引用處理花費的時間很少。但總體時間是2.5秒。或者甚至更糟糕,在最壞的情況下可以達到10秒。
另一個暫停(更舒適)可能看起來像:
2013-07-10T16:26:11.862+0400: 13907,965: [GC pause (young)
Desired survivor size 243269632 bytes, new threshold 4 (max 15)
- age 1: 69125832 bytes, 69125832 total
- age 2: 58756480 bytes, 127882312 total
- age 3: 52397376 bytes, 180279688 total
- age 4: 88850424 bytes, 269130112 total
13907,965: [G1Ergonomics (CSet Construction) start choosing CSet, predicted base time: 77,38 ms, remaining time: 122,62 ms, target pause time: 200,00 ms]
13907,965: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 427 regions, survivors: 33 regions, predicted young region time: 167,95 ms]
13907,965: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 427 regions, survivors: 33 regions, old: 0 regions, predicted pause time: 245,33 ms, target pause time: 200,00 ms]
13908,155: [SoftReference, 0 refs, 0,0041340 secs]13908,160: [WeakReference, 0 refs, 0,0023850 secs]13908,162: [FinalReference, 1393 refs, 0,0065970 secs]13908,169: [PhantomReference, 108 refs, 0,0018650 secs]13908,171: [JNI Weak Reference, 0,0000630 secs], 0,22008100 secs]
[Parallel Time: 188,4 ms]
[GC Worker Start (ms): 13907965,3 13907965,3 13907965,4 13907965,4 13907965,5 13907965,5 13907965,6 13907965,6 13907965,6 13907965,7 13907965,7 13907965,7 13907965,8 13907965,8 13907965,8 13907965,9 13907965,9 13907965,9
Avg: 13907965,6, Min: 13907965,3, Max: 13907965,9, Diff: 0,6]
[Ext Root Scanning (ms): 5,8 5,0 6,8 6,3 6,1 6,2 6,0 6,3 5,2 4,2 5,0 6,2 4,5 6,0 17,1 4,4 6,2 5,3
Avg: 6,3, Min: 4,2, Max: 17,1, Diff: 12,9]
[Update RS (ms): 24,8 26,0 23,9 24,1 24,1 24,1 24,2 23,9 25,0 25,2 25,1 24,1 26,0 24,3 13,7 25,7 24,2 24,7
Avg: 24,1, Min: 13,7, Max: 26,0, Diff: 12,2]
[Processed Buffers : 30 20 9 16 16 19 20 21 22 12 30 17 17 20 12 20 17 22
Sum: 340, Avg: 18, Min: 9, Max: 30, Diff: 21]
[Scan RS (ms): 7,5 7,1 7,2 7,5 7,6 7,5 7,5 7,6 7,1 7,4 7,6 7,2 7,2 7,4 7,2 7,5 7,0 7,7
Avg: 7,4, Min: 7,0, Max: 7,7, Diff: 0,7]
[Object Copy (ms): 133,1 133,1 133,2 133,1 133,2 133,1 133,2 133,1 133,5 134,0 133,0 133,2 133,0 132,9 132,6 133,1 133,2 132,9
Avg: 133,1, Min: 132,6, Max: 134,0, Diff: 1,3]
[Termination (ms): 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0
Avg: 0,0, Min: 0,0, Max: 0,0, Diff: 0,0]
[Termination Attempts : 1 3 1 1 1 1 1 1 1 1 2 3 2 1 1 1 1 1
Sum: 24, Avg: 1, Min: 1, Max: 3, Diff: 2]
[GC Worker End (ms): 13908136,6 13908136,9 13908136,5 13908136,7 13908136,7 13908136,8 13908136,7 13908136,7 13908136,8 13908136,8 13908136,5 13908136,6 13908136,5 13908136,5 13908136,5 13908136,5 13908136,8 13908136,6
Avg: 13908136,7, Min: 13908136,5, Max: 13908136,9, Diff: 0,4]
[GC Worker (ms): 171,3 171,6 171,1 171,2 171,2 171,3 171,1 171,1 171,1 171,2 170,8 170,9 170,7 170,7 170,7 170,6 171,0 170,7
Avg: 171,0, Min: 170,6, Max: 171,6, Diff: 0,9]
[GC Worker Other (ms): 17,2 17,2 17,3 17,3 17,4 17,4 17,5 17,5 17,5 17,5 17,6 17,6 17,7 17,7 17,7 17,7 17,8 17,8
Avg: 17,5, Min: 17,2, Max: 17,8, Diff: 0,6]
[Clear CT: 1,6 ms]
[Other: 30,1 ms]
[Choose CSet: 0,1 ms]
[Ref Proc: 17,1 ms]
[Ref Enq: 0,9 ms]
[Free CSet: 7,4 ms]
[Eden: 3416M(3416M)->0B(3456M) Survivors: 264M->224M Heap: 7289M(18432M)->3912M(18432M)]
[Times: user=3,16 sys=0,00, real=0,22 secs]
參考處理仍然是最長的階段,但它的短得多。 ParallelRefProcEnabled不能解決我的問題。我也試圖改變年輕的gen的大小。它也沒有幫助。設置不同的-XX:MaxGCPauseMillis,更寬鬆600ms或更嚴格100,導致吞吐量仍然不佳。
CMS性能甚至比G1與參數更糟:
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
楊根集合會更長,更頻繁。
我完全與描述的日誌混淆。明天我會嘗試將Couchbase實例移動到另一個節點,以檢查它是否凍結GC線程。但是,如果Couchbase不是重點,也許有人可以解釋我對日誌的含義。或者,也許有一些神奇的CMS參數來解決這個問題。
我會很高興的任何幫助!
您是否有CMS收集器的GC日誌,並啓用了+ XX:+ PrintReferenceGC選項? –
是的,當然。這部分的日誌是關於它的: 13908,155:[SoftReference,0 refs,0,0041340 secs] 13908,160:[WeakReference,0 refs,0.0023850 secs] 13908,162:[FinalReference,1393 refs, 0,0065970秒] 13908,169:[PhantomReference,108參考,0.0018650秒] 13908,171:[JNI弱參考,0,0000630秒],0,22008100秒] [並行時間:188,4毫秒] –
事實上,我們似乎已經通過將couchbase移動到另一臺機器來解決了我們的問題。現在99%的暫停時間少於0.25秒。但是,我們在每1,1-0秒的GC暫停中仍然有大約2000個最終鏈接。也許我應該做點什麼來讓它變得更好。 –