Is kcachegrind/callgrind inaccurate for dispatcher functions?

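I have a model program for which kcachegrind/callgrind reports strange results. It contains a kind of dispatcher function. The dispatcher is called from 4 places; each call says which actual do_J function to run (so first2 calls only do_1 and do_2, and so on).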

Source (this is a model of the actual code):

#define N 1000000

int a[N];
/* do_J touches the first N*J/4 elements, so its cost grows with J. */
void do_1(int *a) { int i; for(i=0;i<N/4;i++) a[i]+=1; }
void do_2(int *a) { int i; for(i=0;i<N/2;i++) a[i]+=2; }
void do_3(int *a) { int i; for(i=0;i<N*3/4;i++) a[i]+=3; }
void do_4(int *a) { int i; for(i=0;i<N;i++) a[i]+=4; }

/* Every call to do_J goes through this single function. */
void dispatcher(int *a, int j) {
    if(j==1) do_1(a);
    else if(j==2) do_2(a);
    else if(j==3) do_3(a);
    else do_4(a);
}

void first2(int *a) { dispatcher(a,1); dispatcher(a,2); }
void last2(int *a) { dispatcher(a,4); dispatcher(a,3); }
void inner2(int *a) { dispatcher(a,2); dispatcher(a,3); }
void outer2(int *a) { dispatcher(a,1); dispatcher(a,4); }

int main(){
    first2(a);
    last2(a);
    inner2(a);
    outer2(a);
    return 0;
}

Compiled with gcc -O0; profiled with valgrind --tool=callgrind; viewed with kcachegrind and qcachegrind-0.7.

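Here is the full call graph of the application. All paths to do_J go through dispatcher, which is correct (do_1 is hidden only because it is too fast, but it really is there, just to the left of do_2):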

[Figure: full call graph]

Let's focus on do_1 and check who called it (this picture is incorrect):

[Figure: callers of do_1 as shown by kcachegrind]

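This is very strange: I think only first2 and outer2 called do_1, not all four functions.

Is this a limitation of callgrind/kcachegrind? How can I get accurate weights (proportional to the running time of each function, with and without its children)?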


2011-09-20 osgx

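Yes, this is a limitation of the callgrind format. It does not store full traces; it stores only parent-child (caller-callee) call information.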

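There is the google-perftools project with its pprof/libprofiler.so CPU profiler: http://google-perftools.googlecode.com/svn/trunk/doc/cpuprofile.html. libprofiler.so can collect a profile with call traces, storing every trace event with its full backtrace. pprof converts the libprofiler output to graphical formats or to the callgrind format. In the full view the result will be the same as in kcachegrind, but if you focus on a particular function, e.g. do_1, using pprof's focus option, it will show an accurate call tree for that function.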


2011-10-30 09:52:20 osgx

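There is a similar test showing the same problem described here: [http://www.yosefk.com/blog/how-profilers-lie-the-cases-of-gprof-and-kcachegrind.html](http://www.yosefk.com/blog/how-profilers-lie-the-cases-of-gprof-and-kcachegrind.html) "This is what we'll see: ... this information is not enough to know what a call tree view needs to know in order to show the truth." And there is a workaround: callgrind's '--separate-callers=N' option, which records N slots of the call stack. – osgx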

The Valgrind documentation describes the useful option '--separate-callers=N', [documented for callgrind](http://valgrind.org/docs/manual/cl-manual.html): see http://valgrind.org/docs/manual/cl-manual.html#cl-manual.cycles (6.2.4. Avoiding cycles) and http://valgrind.org/docs/manual/cl-manual.html#cl-manual.options.separation (6.3.4. Cost entity separation options). – osgx
