C++ optimization
I'm working on something real-time and I need a lot of speed. But in my code I have this:
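float maxdepth;
uint32_t faceindex;
// Selection sort: order the faces by descending depth.
// "tr_iterator + 1 < facesNum" avoids unsigned wrap-around when facesNum is 0.
for (uint32_t tr_iterator = 0; tr_iterator + 1 < facesNum; tr_iterator++)
{
    maxdepth = VXTrisDepth[tr_iterator];
    faceindex = tr_iterator;
    // Find the deepest face among the ones not yet placed.
    for (uint32_t tr_titerator = tr_iterator+1; tr_titerator < facesNum; tr_titerator++)
    {
        float depth = VXTrisDepth[tr_titerator];
        if (depth > maxdepth)
        {
            maxdepth = depth;
            faceindex = tr_titerator;
        }
    }
    // Each face owns three consecutive entries of trs.
    // facelindex has to be computed after the search, once faceindex is final;
    // computing it before the inner loop makes the swaps below a no-op.
    uint32_t tr_literator = 3*tr_iterator;
    uint32_t facelindex = 3*faceindex;
    // Swap the three vertices of the two faces.
    Vei2 itmpx = trs[tr_literator+0];
    trs[tr_literator+0] = trs[facelindex+0];
    trs[facelindex+0] = itmpx;
    itmpx = trs[tr_literator+1];
    trs[tr_literator+1] = trs[facelindex+1];
    trs[facelindex+1] = itmpx;
    itmpx = trs[tr_literator+2];
    trs[tr_literator+2] = trs[facelindex+2];
    trs[facelindex+2] = itmpx;
    // Swap the matching depths.
    float id = VXTrisDepth[tr_iterator];
    VXTrisDepth[tr_iterator] = VXTrisDepth[faceindex];
    VXTrisDepth[faceindex] = id;
}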
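VXTrisDepth is just an array of floats, faceindex is a uint32_t and is a large number, trs is an array of Vei2, and Vei2 is just a 2D vector of integers. The problem is that when facesNum is something like 16074, this loop takes 700 ms to run on my computer, which is far too slow. Any ideas for optimizing it?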
Have you tried the '-O3' switch?
Try using std::swap where you have the tmp variable – JLev
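A minimal sketch of that suggestion. It assumes Vei2 is a plain struct of two ints (a stand-in, since its real definition is not shown) and that both arrays live in std::vector; swapFace is a hypothetical helper name. This mostly tidies the code: an optimizing compiler will likely emit much the same instructions as for the manual swaps, so on its own it probably won't change the 700 ms by much.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Stand-in for the question's Vei2 (assumption: two ints).
struct Vei2 { int x; int y; };

// Hypothetical helper: swap faces a and b (three vertices each)
// together with their depths.
void swapFace(std::vector<Vei2>& trs, std::vector<float>& depth,
              std::uint32_t a, std::uint32_t b)
{
    assert(trs.size() == 3 * depth.size());
    assert(a < depth.size() && b < depth.size());
    for (std::uint32_t k = 0; k < 3; ++k)
        std::swap(trs[3 * a + k], trs[3 * b + k]);
    std::swap(depth[a], depth[b]);
}
```

In the question's loop, the twelve swap lines would then become a single call such as swapFace(trs, VXTrisDepth, tr_iterator, faceindex).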
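A possible optimization is to move the second loop out of the first one: the "2nd" loop builds a vector of maxdepth and faceindex for each tr_titerator, and the 1st loop uses that instead. – megabyte1024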
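For context, the loop in the question is a selection sort: for every position it scans the rest of the array for the largest depth, which is roughly facesNum²/2 comparisons (about 129 million for 16074 faces). The sketch below swaps in a different technique from the one in the comment above: it sorts an index permutation with std::sort, which needs on the order of n log n comparisons, and then reorders both arrays in one pass. Vei2 is a stand-in struct, sortFacesByDepth is a hypothetical name, and the use of std::vector is an assumption; faces with equal depth may end up in a different relative order than in the original loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Stand-in for the question's Vei2 (assumption: two ints).
struct Vei2 { int x; int y; };

// Hypothetical replacement for the nested loops: reorder faces so that
// depth is descending. trs holds three entries per face.
void sortFacesByDepth(std::vector<Vei2>& trs, std::vector<float>& depth)
{
    assert(trs.size() == 3 * depth.size());
    const std::size_t n = depth.size();

    // order[i] = index of the face that should end up at position i.
    std::vector<std::uint32_t> order(n);
    std::iota(order.begin(), order.end(), 0u);
    std::sort(order.begin(), order.end(),
              [&depth](std::uint32_t a, std::uint32_t b) {
                  return depth[a] > depth[b];   // deepest first
              });

    // Apply the permutation to both arrays using scratch copies.
    std::vector<Vei2> sortedTrs(trs.size());
    std::vector<float> sortedDepth(n);
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t src = order[i];
        sortedTrs[3 * i + 0] = trs[3 * src + 0];
        sortedTrs[3 * i + 1] = trs[3 * src + 1];
        sortedTrs[3 * i + 2] = trs[3 * src + 2];
        sortedDepth[i] = depth[src];
    }
    trs.swap(sortedTrs);
    depth.swap(sortedDepth);
}
```

The scratch vectors cost extra memory and an allocation per call; in a real-time loop they could be kept around and reused between frames.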