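I had a similar problem: without optimization, compilation failed because it ran out of registers, and with optimization it took nearly half an hour. My kernel had expressions like this: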
t1itern[II(i,j)] = (1.0 - overr) * t1itero[II(i,j)] + overr * (rhs[IJ(i-1,j-1)].rhs1
    - abiter[IJ(i-1,j-1)].as * t1itern[II(i,j - 1)]
    - abiter[IJ(i-1,j-1)].ase * t1itero[II(i + 1,j - 1)]
    - abiter[IJ(i-1,j-1)].ae * t1itern[II(i + 1,j)]
    - abiter[IJ(i-1,j-1)].ane * t1itero[II(i + 1,j + 1)]
    - abiter[IJ(i-1,j-1)].an * t1itern[II(i,j + 1)]
    - abiter[IJ(i-1,j-1)].anw * t1itero[II(i - 1,j + 1)]
    - abiter[IJ(i-1,j-1)].aw * t1itern[II(i - 1,j)]
    - abiter[IJ(i-1,j-1)].asw * t1itero[II(i - 1,j - 1)]
    - rhs[IJ(i-1,j-1)].aads * t2itern[II(i,j - 1)]
    - rhs[IJ(i-1,j-1)].aadn * t2itern[II(i,j + 1)]
    - rhs[IJ(i-1,j-1)].aade * t2itern[II(i + 1,j)]
    - rhs[IJ(i-1,j-1)].aadw * t2itern[II(i - 1,j)]
    - rhs[IJ(i-1,j-1)].aadc * t2itero[II(i,j)])/abiter[IJ(i-1,j-1)].ac;
When I rewrote them like this:
tt1 = lrhs.rhs1;
tt1 = tt1 - labiter.as * t1itern[II(1,j - 1)];
tt1 = tt1 - labiter.ase * t1itern[II(2,j - 1)];
tt1 = tt1 - labiter.ae * t1itern[II(2,j)];
//etc
it significantly reduced both compile time and register usage.
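For anyone who wants to try the same pattern, here is a minimal sketch of the idea, not my actual kernel: read each coefficient struct once into a local, then subtract one term per statement. The struct layouts, the `idx` helper (standing in for the `II`/`IJ` macros), the `relax_point` name, and the assumption that the coefficient arrays cover only interior points are all hypothetical, and only a 5-point subset of the stencil is shown. The `HD` macro just lets the same helper build as plain C++ or as CUDA device code.

```cpp
// Let the helper compile as plain C++ or, under nvcc, as device code.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical layouts: only a subset of the fields used in the post.
struct Coeffs { double ac, as, ae, an, aw; };
struct Rhs    { double rhs1; };

// Hypothetical row-major indexer standing in for the II/IJ macros.
HD inline int idx(int i, int j, int width) { return j * width + i; }

// One relaxation update for grid point (i, j) on a grid nx points wide.
// Assumption: abiter and rhs hold interior points only, so they are
// (nx - 2) wide and are indexed with (i - 1, j - 1).
HD inline double relax_point(const Coeffs* abiter, const Rhs* rhs,
                             const double* t_new, const double* t_old,
                             int i, int j, int nx, double overr)
{
    // Load each struct once instead of re-indexing it in every term.
    const Coeffs c = abiter[idx(i - 1, j - 1, nx - 2)];
    const Rhs    r = rhs[idx(i - 1, j - 1, nx - 2)];

    // Accumulate term by term, one short statement per neighbour.
    double acc = r.rhs1;
    acc -= c.as * t_new[idx(i, j - 1, nx)];
    acc -= c.ae * t_new[idx(i + 1, j, nx)];
    acc -= c.an * t_new[idx(i, j + 1, nx)];
    acc -= c.aw * t_new[idx(i - 1, j, nx)];

    // Blend the old value with the new estimate using the relaxation factor.
    return (1.0 - overr) * t_old[idx(i, j, nx)] + overr * acc / c.ac;
}
```

Inside a kernel you would call something like `relax_point` once per thread and store the result. Whether this helps as much as it did for me probably depends on the compiler version, so treat it as something to try rather than a guaranteed fix.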
Maybe it's a compiler bug? Is the compiler using a lot of memory and causing the system to crash? – 2009-10-21 22:07:08
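Given the nature of the problem, I wouldn't be surprised, especially since it compiles quickly when I build with --device-emulation. Of course, even if it is a bug in the compiler, I'd still like to be able to do something about it. – rck 2009-10-23 18:14:36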
What happens if you disable optimization? – 2009-10-26 14:53:45