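We stumbled upon a performance problem with Quartz Events, more specifically with CGEventPost: under heavy GPU load, CGEventPost can block. We created a small benchmark application to demonstrate the issue. The application is just a loop that creates, posts and releases events. In short, CGEventPost performs poorly under GPU load.
Below you can see the results of running the application. The first run was on an idle system. The second run was with FurMark (a GPU stress test) running with its settings turned up as far as possible.
- Inner is how long the inner loop takes, which is basically just creating, posting and releasing one event with Quartz Events.
- Outer is how long our program waited to be woken up (a sleep). It should be close to the time we sleep for, but it may be delayed if the system is under pressure.
- Post is how long the event post itself takes.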
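For readers who want to reproduce the three measurements, here is a minimal sketch of such a loop in C. It is not the original benchmark source: the names `now_ms`, `post_one_event`, `run_benchmark` and `sample_t` are made up for illustration, posting a null event created with `CGEventCreate(NULL)` to `kCGHIDEventTap` is an assumption about what the loop posts, and outside macOS the post is replaced by an empty stub so that only the timing logic is exercised.

```c
#ifndef __APPLE__
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime/nanosleep on non-Apple systems */
#endif
#include <stdio.h>
#include <time.h>

#ifdef __APPLE__
#include <ApplicationServices/ApplicationServices.h>
#endif

/* One row of the log: inner, outer and CGEventPost times in milliseconds. */
typedef struct { double inner_ms, outer_ms, post_ms; } sample_t;

/* Monotonic clock reading in milliseconds. */
static double now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/* Create, post and release one event.
 * Returns the time spent inside the post call only.
 * Assumption: a null event posted at the HID tap; the original
 * benchmark may post a different kind of event. */
static double post_one_event(void) {
    double t0, t1;
#ifdef __APPLE__
    CGEventRef ev = CGEventCreate(NULL);
    t0 = now_ms();
    CGEventPost(kCGHIDEventTap, ev);
    t1 = now_ms();
    CFRelease(ev);
#else
    /* Stub for non-macOS builds: nothing is posted. */
    t0 = now_ms();
    t1 = now_ms();
#endif
    return t1 - t0;
}

/* Fill out[0..n-1] with n samples, sleeping sleep_ms between iterations. */
static void run_benchmark(sample_t *out, int n, long sleep_ms) {
    struct timespec nap = { sleep_ms / 1000, (sleep_ms % 1000) * 1000000L };
    for (int i = 0; i < n; i++) {
        double before_sleep = now_ms();
        nanosleep(&nap, NULL);
        double woke = now_ms();
        out[i].outer_ms = woke - before_sleep;   /* time until we were woken */
        out[i].post_ms  = post_one_event();      /* the post call alone */
        out[i].inner_ms = now_ms() - woke;       /* create + post + release */
    }
}
```

Calling `run_benchmark(samples, 9, 10)` and printing each `sample_t` would give rows shaped like the logs below. On macOS the file would need to be linked against the ApplicationServices framework.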
18:58:01.683 EventPerformance[4946:707] Measurements: (outer should be close to 10)
18:58:01.684 EventPerformance[4946:707] inner (ms): 0.04, outer (ms): 11.02, CGEventPost (ms): 0.03
18:58:01.684 EventPerformance[4946:707] inner (ms): 0.04, outer (ms): 11.02, CGEventPost (ms): 0.03
18:58:01.685 EventPerformance[4946:707] inner (ms): 0.07, outer (ms): 10.26, CGEventPost (ms): 0.03
18:58:01.685 EventPerformance[4946:707] inner (ms): 0.06, outer (ms): 10.85, CGEventPost (ms): 0.05
18:58:01.686 EventPerformance[4946:707] inner (ms): 0.07, outer (ms): 10.41, CGEventPost (ms): 0.04
18:58:01.686 EventPerformance[4946:707] inner (ms): 0.04, outer (ms): 10.39, CGEventPost (ms): 0.03
18:58:01.686 EventPerformance[4946:707] inner (ms): 0.05, outer (ms): 11.02, CGEventPost (ms): 0.03
18:58:01.687 EventPerformance[4946:707] inner (ms): 0.03, outer (ms): 10.67, CGEventPost (ms): 0.03
18:58:01.687 EventPerformance[4946:707] inner (ms): 0.08, outer (ms): 10.09, CGEventPost (ms): 0.05
18:58:01.688 EventPerformance[4946:707] Averages: (outer should be close to 10)
18:58:01.688 EventPerformance[4946:707] avg inner (ms): 0.05, avg outer (ms): 10.64, avg post (ms): 0.03
Here we can see that posting an event takes about 0.03 ms on average. The thread also seems to be woken up around 0.5 ms too late. There are no spikes in CGEventPost.
19:02:02.150 EventPerformance[5241:707] Measurements: (outer should be close to 10)
19:02:02.151 EventPerformance[5241:707] inner (ms): 0.03, outer (ms): 10.23, CGEventPost (ms): 0.02
19:02:02.151 EventPerformance[5241:707] inner (ms): 0.02, outer (ms): 10.54, CGEventPost (ms): 0.02
19:02:02.151 EventPerformance[5241:707] inner (ms): 0.02, outer (ms): 11.01, CGEventPost (ms): 0.01
19:02:02.152 EventPerformance[5241:707] inner (ms): 0.02, outer (ms): 10.74, CGEventPost (ms): 0.01
19:02:02.152 EventPerformance[5241:707] inner (ms): 0.02, outer (ms): 10.20, CGEventPost (ms): 0.01
19:02:02.152 EventPerformance[5241:707] inner (ms): 10.35, outer (ms): 11.01, CGEventPost (ms): 10.35
19:02:02.152 EventPerformance[5241:707] inner (ms): 0.03, outer (ms): 10.02, CGEventPost (ms): 0.02
19:02:02.153 EventPerformance[5241:707] inner (ms): 58.90, outer (ms): 10.11, CGEventPost (ms): 58.90
19:02:02.153 EventPerformance[5241:707] inner (ms): 0.03, outer (ms): 10.12, CGEventPost (ms): 0.02
19:02:02.153 EventPerformance[5241:707] Averages: (outer should be close to 10)
19:02:02.371 EventPerformance[5241:707] avg inner (ms): 7.71, avg outer (ms): 10.44, avg post (ms): 7.71
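When the system is under heavy GPU load, posting an event can take milliseconds (in spikes) instead of microseconds. Under extreme GPU stress (< 1 FPS), this value can reach several seconds. CGEventPost sometimes seems to wait for the GPU to finish some work before returning. Our thread is still scheduled normally, with no noticeable delays or spikes (outer).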
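To make the spike behaviour concrete, the following small C sketch summarizes the nine CGEventPost times printed in the second run. The helper names `summarize`, `stats_t` and `loaded_post_ms` are made up for illustration; the data values are copied from the log above, and nothing else is assumed.

```c
#include <stdio.h>
#include <stdlib.h>

/* CGEventPost times (ms) from the nine measurements of the second run. */
static double loaded_post_ms[] = {
    0.02, 0.02, 0.01, 0.01, 0.01, 10.35, 0.02, 58.90, 0.02
};

typedef struct { double mean, median, max; } stats_t;

/* qsort comparator for doubles, ascending. */
static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Sorts v in place and returns mean, median and maximum (n must be > 0). */
static stats_t summarize(double *v, size_t n) {
    stats_t s;
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    qsort(v, n, sizeof v[0], cmp_double);
    s.mean = sum / (double)n;
    s.median = (n % 2) ? v[n / 2] : (v[n / 2 - 1] + v[n / 2]) / 2.0;
    s.max = v[n - 1];
    return s;
}
```

Calling `summarize(loaded_post_ms, 9)` gives a mean of about 7.71 ms, which matches the logged average, while the median stays at 0.02 ms and the maximum is 58.90 ms. In other words, the typical call is still fast, and the average is dominated by the two spikes.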
Any ideas are appreciated.
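When submitting a lot of OpenCL work, I have seen an even worse effect on Windows (I know you are on OS X): all GPU drawing gets slower and slower. It is as if the GPU had no concept of priority, or of wavefront aging. – doug65536
Try profiling with Instruments to see where the slowdown occurs. – monoxygen
What type of graphics card is in the machine you ran this on? Integrated, integrated + discrete, or discrete? It would be interesting to see whether it behaves differently on systems with different graphics configurations. – monoxygen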