Custom glBlendFunc a lot slower than native

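I'm trying to implement my own custom glBlendFunc in a fragment shader. However, my solution is a lot slower than the native glBlendFunc, even when both compute exactly the same blend function.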

I'm wondering if anyone has suggestions on how to do this more efficiently.

My solution works something like this:

void draw(fbo fbos[2], render_item item) 
{ 
    // fbos[0] is the render target 
    // fbos[1] is the previous render target used to read "background" to blend against in shader 
    // Both fbos have exactly the same content, however they need to be different since we can't both read and write to the same texture. The texture we render to needs to have the entire content since we might not draw geometry everywhere. 

    fbos[0]->attach(); // Attach fbo 
    fbos[1]->bind(1); // Bind as texture 1 

    render(item); 

    glCopyTexSubImage2D(...); // copy from fbos[0] to fbos[1], fbos[1] == fbos[0] 
}

fragment.glsl

vec4 blend_color(vec4 fore) 
{ 
    vec4 back = texture2D(background, gl_TexCoord[1].st); // background is read from texture "1" 
    return vec4(mix(back.rgb, fore.rgb, fore.a), back.a + fore.a); 
}


2011-08-14 ronag

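To improve the performance of FBO-based blending, your best option is NV_texture_barrier. Despite the name, AMD has implemented it as well, so as long as you stick to Radeon HD-class cards, it should be available to you.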

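Basically, it allows you to ping-pong without heavyweight operations such as FBO binding or texture attachment changes. The specification has a section near the bottom that shows the general algorithm.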

Another alternative is EXT_shader_image_load_store. This requires DX11/GL 4.x-class hardware. OpenGL 4.2 recently promoted it to core as ARB_shader_image_load_store.

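Even so, as Darcy said, you're never going to beat regular blending. It uses special hardware structures that shaders can't access (because they operate after the shader has run). You should only do programmatic blending if there is some effect you absolutely cannot accomplish any other way.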


2011-08-14 03:00:04

NV_texture_barrier would allow me to render to and read from the same texture at the same time, correct? – ronag

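@ronag: Sort of. It's best to read the extension spec, but the general gist is that you can read from and write to the same texture as long as you're not doing both at the same location, and as long as you use the barrier appropriately. –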

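Native blending is a lot more efficient because the blending operations are built directly into the GPU hardware, so you probably aren't going to be able to beat it on speed. That said, make sure you have turned off depth testing, back-face culling, hardware blending, and any other operations you don't need. I can't say it will make a huge difference, but it might make some.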


2011-08-14 02:44:20
