Is there any synchronization cost to calling a synchronized method from another synchronized method? That is, between this:
synchronized void x() {
    y();
}

synchronized void y() {
}
versus this:
synchronized void x() {
    y();
}

void y() {
}
Yes, there is an extra performance cost, unless and until the JVM inlines the call to y(), at which point a modern JIT compiler will eliminate any performance difference in fairly short order. First, consider the case where y() is visible outside the class. In that case, the JVM must check on entering y() that it can enter the object's monitor; this check will always succeed when the call comes from x(), but it cannot be skipped, because the call could come from a client outside the class. This additional check is cheap.
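That external-caller case is easy to observe directly: while one thread is inside x() holding the monitor, a call to y() from another thread must block until the monitor is released, whereas the nested call from x() succeeds immediately. A minimal sketch (the class and method names here are mine, not from the question):

```java
public class MonitorDemo {
    static final class Worker {
        synchronized void x() {
            try { Thread.sleep(300); } catch (InterruptedException e) { }
            y(); // re-entering a monitor we already hold always succeeds
        }
        synchronized void y() { }
    }

    // Returns roughly how long an external caller had to wait for y()'s monitor.
    static long measureBlockingMillis() {
        Worker w = new Worker();
        Thread holder = new Thread(w::x);
        holder.start();
        try { Thread.sleep(50); } catch (InterruptedException e) { }  // let holder enter x()
        long start = System.nanoTime();
        w.y();  // external call: must wait until x() releases the monitor
        long waitedMs = (System.nanoTime() - start) / 1_000_000;
        try { holder.join(); } catch (InterruptedException e) { }
        return waitedMs;
    }

    public static void main(String[] args) {
        System.out.println("blocked for ~" + measureBlockingMillis() + " ms");
    }
}
```

The external call blocks for roughly the remainder of the 300 ms that x() holds the lock, which is exactly why the monitor check on entering y() cannot be skipped.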
Next, consider the case where y() is private. In that case, the compiler still does not optimize away the synchronization; see the following disassembly of an empty y():
private synchronized void y();
    flags: ACC_PRIVATE, ACC_SYNCHRONIZED
    Code:
      stack=0, locals=1, args_size=1
         0: return
According to the spec's definition of synchronized, every entry into a synchronized block or method performs a lock action on the object, and leaving performs an unlock action. No other thread can acquire that object's monitor until the lock counter goes down to zero. Presumably some kind of static analysis could demonstrate that a private synchronized method is only ever called from other synchronized methods, but Java's multi-source-file support would make that fragile at best, even ignoring reflection. This means that the JVM must still increment the counter on entering y(); the spec says this about monitor entry and exit around method invocation and return:

Monitor entry on invocation of a synchronized method, and monitor exit on its return, are handled implicitly by the Java Virtual Machine's method invocation and return instructions, as if monitorenter and monitorexit were used.
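The lock-counter behavior described above is monitor reentrancy, which is simple to demonstrate: a nested synchronized call on the same monitor does not deadlock, it just bumps the count. A minimal sketch (class and field names are mine):

```java
public class ReentrancyDemo {
    static int depth = 0;

    static synchronized void outer() {
        depth++;
        inner(); // same monitor (the Class object), already held: count goes 1 -> 2
    }

    static synchronized void inner() {
        depth++; // reached without deadlock, so re-entry succeeded
    }

    public static void main(String[] args) {
        outer();
        System.out.println(depth); // prints 2
    }
}
```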
@AmolSonawane correctly notes that the JVM may optimize this code at runtime by performing lock coarsening, essentially inlining the y() method. In that case, after the JVM has decided to perform a JIT optimization, calls from x() to y() will not incur any additional performance overhead, but of course calls directly to y() from any other location will still need to acquire the monitor separately.
In the case where both methods are synchronized, you would be locking the monitor twice, so the first variant has the additional overhead of locking again. However, your JVM can reduce the cost of locking by lock coarsening, and it may inline the call to y().
You don't need to acquire the lock if you already hold it... – assylias
That's not true; if both methods are synchronized and non-static, the extra lock acquisition is required. –
"A thread t may lock a particular monitor multiple times; each unlock reverses the effect of one lock action." – Java Language Specification 17.1 –
Why not test it!? I ran a quick benchmark. The benchMark() method is called in a loop for warm-up. This may not be very accurate, but it does show a consistent, interesting pattern.
public class Test {
    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            System.out.println("+++++++++");
            benchMark();
        }
    }

    static void benchMark() {
        Test t = new Test();
        long start = System.nanoTime();
        for (int i = 0; i < 100; i++) {
            t.x();
        }
        System.out.println("Double sync:" + (System.nanoTime() - start) / 1e6);
        start = System.nanoTime();
        for (int i = 0; i < 100; i++) {
            t.x1();
        }
        System.out.println("Single sync:" + (System.nanoTime() - start) / 1e6);
    }

    synchronized void x() {
        y();
    }

    synchronized void y() {
    }

    synchronized void x1() {
        y1();
    }

    void y1() {
    }
}
Results (last 10):
+++++++++
Double sync:0.021686
Single sync:0.017861
+++++++++
Double sync:0.021447
Single sync:0.017929
+++++++++
Double sync:0.021608
Single sync:0.016563
+++++++++
Double sync:0.022007
Single sync:0.017681
+++++++++
Double sync:0.021454
Single sync:0.017684
+++++++++
Double sync:0.020821
Single sync:0.017776
+++++++++
Double sync:0.021107
Single sync:0.017662
+++++++++
Double sync:0.020832
Single sync:0.017982
+++++++++
Double sync:0.021001
Single sync:0.017615
+++++++++
Double sync:0.042347
Single sync:0.023859
It looks like the second variant is indeed slightly faster.
A micro benchmark run with jmh:
Benchmark Mean Mean error Units
c.a.p.SO18996783.syncOnce 21.003 0.091 nsec/op
c.a.p.SO18996783.syncTwice 20.937 0.108 nsec/op
=> The results show no statistically significant difference.
Looking at the generated assembly shows that lock coarsening has been performed and that y_sync has been inlined into x_sync, even though it is synchronized.
Full results:
Benchmarks:
# Running: com.assylias.performance.SO18996783.syncOnce
Iteration 1 (5000ms in 1 thread): 21.049 nsec/op
Iteration 2 (5000ms in 1 thread): 21.052 nsec/op
Iteration 3 (5000ms in 1 thread): 20.959 nsec/op
Iteration 4 (5000ms in 1 thread): 20.977 nsec/op
Iteration 5 (5000ms in 1 thread): 20.977 nsec/op
Run result "syncOnce": 21.003 ±(95%) 0.055 ±(99%) 0.091 nsec/op
Run statistics "syncOnce": min = 20.959, avg = 21.003, max = 21.052, stdev = 0.044
Run confidence intervals "syncOnce": 95% [20.948, 21.058], 99% [20.912, 21.094]
Benchmarks:
com.assylias.performance.SO18996783.syncTwice
Iteration 1 (5000ms in 1 thread): 21.006 nsec/op
Iteration 2 (5000ms in 1 thread): 20.954 nsec/op
Iteration 3 (5000ms in 1 thread): 20.953 nsec/op
Iteration 4 (5000ms in 1 thread): 20.869 nsec/op
Iteration 5 (5000ms in 1 thread): 20.903 nsec/op
Run result "syncTwice": 20.937 ±(95%) 0.065 ±(99%) 0.108 nsec/op
Run statistics "syncTwice": min = 20.869, avg = 20.937, max = 21.006, stdev = 0.052
Run confidence intervals "syncTwice": 95% [20.872, 21.002], 99% [20.829, 21.045]
There will be no difference, because thread contention happens only when acquiring the lock at x(). The thread that acquired the lock at x() can then acquire the lock at y() without any contention (as it is the only thread that can reach that point at a given time). So placing synchronized there has no effect.
The test can be found below (you have to guess what some of the methods do, but nothing complicated):
It runs each variant with 100 threads and starts counting the averages once 70% of them have completed (as warm-up).
It prints them once at the end.
public static final class Test {
    final int iterations = 100;
    final int jiterations = 1000000;
    final int count = (int) (0.7 * iterations);
    final AtomicInteger finishedSingle = new AtomicInteger(iterations);
    final AtomicInteger finishedZynced = new AtomicInteger(iterations);
    final MovingAverage.Cumulative singleCum = new MovingAverage.Cumulative();
    final MovingAverage.Cumulative zyncedCum = new MovingAverage.Cumulative();
    final MovingAverage singleConv = new MovingAverage.Converging(0.5);
    final MovingAverage zyncedConv = new MovingAverage.Converging(0.5);

    // -----------------------------------------------------------
    // -----------------------------------------------------------

    public static void main(String[] args) {
        final Test test = new Test();
        for (int i = 0; i < test.iterations; i++) {
            test.benchmark(i);
        }
        Threads.sleep(1000000);
    }

    // -----------------------------------------------------------
    // -----------------------------------------------------------

    void benchmark(int i) {
        Threads.async(() -> {
            long start = System.nanoTime();
            for (int j = 0; j < jiterations; j++) {
                a();
            }
            long elapsed = System.nanoTime() - start;
            int v = this.finishedSingle.decrementAndGet();
            if (v <= count) {
                singleCum.add(elapsed);
                singleConv.add(elapsed);
            }
            if (v == 0) {
                System.out.println(elapsed);
                System.out.println("Single Cum:\t\t" + singleCum.val());
                System.out.println("Single Conv:\t" + singleConv.val());
                System.out.println();
            }
        });

        Threads.async(() -> {
            long start = System.nanoTime();
            for (int j = 0; j < jiterations; j++) {
                az();
            }
            long elapsed = System.nanoTime() - start;
            int v = this.finishedZynced.decrementAndGet();
            if (v <= count) {
                zyncedCum.add(elapsed);
                zyncedConv.add(elapsed);
            }
            if (v == 0) {
                // Just to avoid the output overlapping with the one above
                Threads.sleep(500);
                System.out.println();
                System.out.println("Zynced Cum: \t" + zyncedCum.val());
                System.out.println("Zynced Conv:\t" + zyncedConv.val());
                System.out.println();
            }
        });
    }

    synchronized void a() { b(); }
    void b() { c(); }
    void c() { d(); }
    void d() { e(); }
    void e() { f(); }
    void f() { g(); }
    void g() { h(); }
    void h() { i(); }
    void i() { }

    synchronized void az() { bz(); }
    synchronized void bz() { cz(); }
    synchronized void cz() { dz(); }
    synchronized void dz() { ez(); }
    synchronized void ez() { fz(); }
    synchronized void fz() { gz(); }
    synchronized void gz() { hz(); }
    synchronized void hz() { iz(); }
    synchronized void iz() { }
}
Basically, MovingAverage.Cumulative.add does (performed atomically): average = (average * n + number) / (++n).
MovingAverage.Converging you can look up, but it uses another formula.
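MovingAverage is one of the answerer's own helper classes and is not shown. A minimal sketch of what the Cumulative part might look like, reconstructed only from the formula above (the name CumulativeAverage and everything in the body are my assumptions, not the original class):

```java
// Hypothetical reconstruction of MovingAverage.Cumulative, based solely on
// the stated formula: average = (average * n + number) / (++n).
public class CumulativeAverage {
    private double average = 0.0;
    private long n = 0;

    public synchronized void add(double number) {
        // Incorporate one more sample into the running mean.
        average = (average * n + number) / (++n);
    }

    public synchronized double val() {
        return average;
    }
}
```

For the samples 2, 4, 6 this yields the running mean 4.0, as expected.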
The results after a 50-second warm-up:
With: jiterations -> 1000000
Zynced Cum: 3.2017985649516254E11
Zynced Conv: 8.11945143126507E10
Single Cum: 4.747368153507841E11
Single Conv: 8.277793176290959E10
Those are nanosecond averages. That is really nothing, and it even suggests that the zynced variant takes less time.
With: jiterations -> original * 10 (takes much longer)
Zynced Cum: 7.462005651190714E11
Zynced Conv: 9.03751742946726E11
Single Cum: 9.088230941676143E11
Single Conv: 9.09877020004914E11
As you can see, the results show that it is really not a big difference. The zynced variant actually had the lower average time for the last 30% to complete.
With one thread each (iterations = 1) and jiterations = original * 100:
Zynced Cum: 6.9167088486E10
Zynced Conv: 6.9167088486E10
Single Cum: 6.9814404337E10
Single Conv: 6.9814404337E10
In a single-threaded environment (Threads.async calls removed):
With: jiterations -> original * 10
Single Cum: 2.940499529542545E8
Single Conv: 5.0342450600964054E7
Zynced Cum: 1.1930525617915475E9
Zynced Conv: 6.672312498662484E8
The zynced one appears slower here, by a factor of about 10. The reason might be that the zynced loop runs after the other one each time; who knows. No energy to try the reverse.
The last test was run with:
public static final class Test {
    final int iterations = 100;
    final int jiterations = 10000000;
    final int count = (int) (0.7 * iterations);
    final AtomicInteger finishedSingle = new AtomicInteger(iterations);
    final AtomicInteger finishedZynced = new AtomicInteger(iterations);
    final MovingAverage.Cumulative singleCum = new MovingAverage.Cumulative();
    final MovingAverage.Cumulative zyncedCum = new MovingAverage.Cumulative();
    final MovingAverage singleConv = new MovingAverage.Converging(0.5);
    final MovingAverage zyncedConv = new MovingAverage.Converging(0.5);

    // -----------------------------------------------------------
    // -----------------------------------------------------------

    public static void main(String[] args) {
        final Test test = new Test();
        for (int i = 0; i < test.iterations; i++) {
            test.benchmark(i);
        }
        Threads.sleep(1000000);
    }

    // -----------------------------------------------------------
    // -----------------------------------------------------------

    void benchmark(int i) {
        long start = System.nanoTime();
        for (int j = 0; j < jiterations; j++) {
            a();
        }
        long elapsed = System.nanoTime() - start;
        int s = this.finishedSingle.decrementAndGet();
        if (s <= count) {
            singleCum.add(elapsed);
            singleConv.add(elapsed);
        }
        if (s == 0) {
            System.out.println(elapsed);
            System.out.println("Single Cum:\t\t" + singleCum.val());
            System.out.println("Single Conv:\t" + singleConv.val());
            System.out.println();
        }

        long zstart = System.nanoTime();
        for (int j = 0; j < jiterations; j++) {
            az();
        }
        long elapzed = System.nanoTime() - zstart;
        int z = this.finishedZynced.decrementAndGet();
        if (z <= count) {
            zyncedCum.add(elapzed);
            zyncedConv.add(elapzed);
        }
        if (z == 0) {
            // Just to avoid the output overlapping with the one above
            Threads.sleep(500);
            System.out.println();
            System.out.println("Zynced Cum: \t" + zyncedCum.val());
            System.out.println("Zynced Conv:\t" + zyncedConv.val());
            System.out.println();
        }
    }

    synchronized void a() { b(); }
    void b() { c(); }
    void c() { d(); }
    void d() { e(); }
    void e() { f(); }
    void f() { g(); }
    void g() { h(); }
    void h() { i(); }
    void i() { }

    synchronized void az() { bz(); }
    synchronized void bz() { cz(); }
    synchronized void cz() { dz(); }
    synchronized void dz() { ez(); }
    synchronized void ez() { fz(); }
    synchronized void fz() { gz(); }
    synchronized void gz() { hz(); }
    synchronized void hz() { iz(); }
    synchronized void iz() { }
}
Conclusion: there is really no difference.
I would be surprised if there were a difference. See also http://www.oracle.com/technetwork/java/6-performance-137236.html (2.1.1 and 2.1.2). – assylias