GLSL SpinLock only Mostly Works

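I have implemented a depth peeling algorithm using a GLSL spinlock (inspired by this). In the visualization below, notice that the depth peeling algorithm works correctly overall (first layer top left, second layer top right, third layer bottom left, fourth layer bottom right). The four depth layers are stored in a single RGBA texture.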

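Unfortunately, the spinlock sometimes fails to prevent errors: you can see small white speckles, particularly in the fourth layer. There is also one on the spaceship in the second layer. The speckles change every frame.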


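In my GLSL spinlock, when a fragment is about to be drawn, the fragment program atomically reads and writes a locking value in a separate locking texture, waiting until a 0 shows up, which indicates that the lock is open. In practice, I found that the program must not block: if two threads of the same warp land on the same pixel, the warp cannot continue (one thread must wait while the other proceeds, yet all threads in a GPU warp must execute in lockstep).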
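For contrast, here is a minimal sketch of the naive blocking form that this paragraph warns against. The name `lock_img` is hypothetical, and the lock texture is assumed to be cleared to 0 (unlocked) before the draw call. It is only meant to illustrate the stall, not to be used.

```glsl
#version 420 core

// Hypothetical lock texture, assumed cleared to 0 (unlocked) before drawing.
layout(r32ui) coherent uniform uimage2D lock_img;

void main() {
    ivec2 coord = ivec2(gl_FragCoord.xy);

    // Busy-wait until the previous lock value was 0, i.e. this thread took the lock.
    // If two threads of one warp target the same pixel, the loser spins here
    // while the winner may never get scheduled to reach the release below.
    while (imageAtomicExchange(lock_img, coord, 1u) != 0u) {
        // spin
    }

    // ... critical section would go here ...

    memoryBarrier();
    imageAtomicExchange(lock_img, coord, 0u); // release the lock
    discard;
}
```

Looping over the whole acquire/work/release sequence instead, as the shader in the question does, lets the thread that holds the lock finish inside the same loop iteration.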

My fragment program looks like this (comments and spacing added):

#version 420 core 

//locking texture 
layout(r32ui) coherent uniform uimage2D img2D_0; 
//data texture, also render target 
layout(rgba32f) coherent uniform image2D img2D_1; 

//Inserts "new_data" into "data", a sorted list 
vec4 insert(vec4 data, float new_data) { 
    if  (new_data<data.x) return vec4(  new_data,data.xyz); 
    else if (new_data<data.y) return vec4(data.x,new_data,data.yz); 
    else if (new_data<data.z) return vec4(data.xy,new_data,data.z); 
    else if (new_data<data.w) return vec4(data.xyz,new_data  ); 
    else      return data; 
} 

void main() { 
    ivec2 coord = ivec2(gl_FragCoord.xy); 

    //The idea here is to keep looping over a pixel until a value is written. 
    //By looping over the entire logic, threads in the same warp aren't stalled 
    //by other waiting threads. The first imageAtomicExchange call sets the 
    //locking value to 1. If the locking value was already 1, then someone 
    //else has the lock, and can_write is false. If the locking value was 0, 
    //then the lock is free, and can_write is true. The depth is then read, 
    //the new value inserted, but only written if can_write is true (the 
    //locking texture was free). The second imageAtomicExchange call resets 
    //the lock back to 0. 

    bool have_written = false; 
    while (!have_written) {
        bool can_write = (imageAtomicExchange(img2D_0,coord,1u) != 1u);

        memoryBarrier();

        vec4 depths = imageLoad(img2D_1,coord);
        depths = insert(depths,gl_FragCoord.z);

        if (can_write) {
            imageStore(img2D_1,coord,depths);
            have_written = true;
        }

        memoryBarrier();

        imageAtomicExchange(img2D_0,coord,0);

        memoryBarrier();
    }
    discard; //Already wrote to render target with imageStore 
}

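My question is: why does this speckling behavior occur? I want the spinlock to work 100% of the time! Could it be related to the placement of my memoryBarrier() calls?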


2012-08-05 imallett

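The "imageAtomicExchange(img2D_0,coord,0);" needs to be inside the if statement, because as written it resets the lock variable even for threads that never held the lock! Changing this fixed it.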
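A minimal sketch of that change, applied to the shader from the question: only the lock release moves inside the `if`, so a thread that failed to take the lock no longer resets it. The placement of the memoryBarrier() calls and the `0u` literal are assumptions, and this is an illustration of the described edit rather than a verified fix.

```glsl
#version 420 core

layout(r32ui) coherent uniform uimage2D img2D_0;   // locking texture
layout(rgba32f) coherent uniform image2D img2D_1;  // data texture

// Inserts "new_data" into "data", a sorted list (same helper as in the question).
vec4 insert(vec4 data, float new_data) {
    if      (new_data < data.x) return vec4(new_data, data.xyz);
    else if (new_data < data.y) return vec4(data.x, new_data, data.yz);
    else if (new_data < data.z) return vec4(data.xy, new_data, data.z);
    else if (new_data < data.w) return vec4(data.xyz, new_data);
    else                        return data;
}

void main() {
    ivec2 coord = ivec2(gl_FragCoord.xy);

    bool have_written = false;
    while (!have_written) {
        // Previous value 0 means the lock was free and is now ours.
        bool can_write = (imageAtomicExchange(img2D_0, coord, 1u) != 1u);

        memoryBarrier();

        vec4 depths = imageLoad(img2D_1, coord);
        depths = insert(depths, gl_FragCoord.z);

        if (can_write) {
            imageStore(img2D_1, coord, depths);
            have_written = true;

            memoryBarrier();

            // Release only if this thread actually holds the lock.
            imageAtomicExchange(img2D_0, coord, 0u);
        }
    }
    discard; // Already wrote to the render target with imageStore
}
```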


2013-01-31 05:48:55 imallett

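What does the final fragment shader look like? Does it still have the memoryBarrier() operations? – ragnar 2013-02-12 18:55:05

Yes, but in more succinct places. IIRC (it's procedurally generated), they come only after the imageAtomicExchange calls. – imallett 2013-02-12 21:35:06

I was actually wrong in thinking that this solved the problem. I have put together a more complete list here: http://stackoverflow.com/questions/21538555/broken-glsl-spinlock-glsl-locks-compendium – imallett 2014-02-03 21:53:29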

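For reference, here is locking code that has been tested to work with Nvidia drivers 314.22 & 320.18 on a GTX670. Note that reordering the code, or rewriting it as logically equivalent code, triggers existing compiler optimization bugs (see the comments in the code below). Note also that the code below uses bindless image references.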

// sem is initialized to zero 
coherent uniform layout(size1x32) uimage2D sem; 

void main(void) 
{ 
    ivec2 coord = ivec2(gl_FragCoord.xy); 

    bool done = false; 
    uint locked = 0; 
    while(!done)
    {
        // locked = imageAtomicCompSwap(sem, coord, 0u, 1u); will NOT work
        locked = imageAtomicExchange(sem, coord, 1u);
        if (locked == 0)
        {
            performYourCriticalSection();

            memoryBarrier();

            imageAtomicExchange(sem, coord, 0u);

            // replacing this with a break will NOT work
            done = true;
        }
    }

    discard; 
}


2013-05-28 21:51:03
