2016-05-17 51 views
0

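I wrote the following simple code to check whether the GPU can do some computation work. I can't get the data back from the GPU when using Metal.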

id<MTLDevice> device = MTLCreateSystemDefaultDevice(); 
NSLog(@"Device: %@", [device name]); 

id<MTLCommandQueue> commandQueue = [device newCommandQueue]; 

NSError * ns_error = nil; 
id<MTLLibrary>defaultLibrary = [device newLibraryWithFile:@"/Users/i/tmp/tmp6/s.metallib" error:&ns_error]; 

// Buffer for storing encoded commands that are sent to GPU 
id<MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer]; 

// Encoder for GPU commands 
id <MTLComputeCommandEncoder> computeCommandEncoder = [commandBuffer computeCommandEncoder]; 

//set input and output data 
float tmpbuf[1000]; 
float outbuf[1000]; 
for(int i = 0; i < 1000; i++) 
{ 
    tmpbuf[i] = i; 
    outbuf[i] = 0; 
} 

int tmp_length = 100*sizeof(float); 
id<MTLBuffer> inVectorBuffer = [device newBufferWithBytes: tmpbuf length: tmp_length options: MTLResourceOptionCPUCacheModeDefault ]; 
[computeCommandEncoder setBuffer: inVectorBuffer offset: 0 atIndex: 0 ]; 
id<MTLBuffer> outVectorBuffer = [device newBufferWithBytes: outbuf length: tmp_length options: MTLResourceOptionCPUCacheModeDefault ]; 
[computeCommandEncoder setBuffer: outVectorBuffer offset: 0 atIndex: 1 ]; 


// get the kernel function
id<MTLFunction> newfunc = [ defaultLibrary newFunctionWithName:@"sigmoid" ]; 

// get the compute pipeline state
id<MTLComputePipelineState> cpipeline = [device newComputePipelineStateWithFunction: newfunc error:&ns_error ]; 

[computeCommandEncoder setComputePipelineState:cpipeline ]; 

// 
MTLSize ts= {10, 10, 1}; 
MTLSize numThreadgroups = {2, 5, 1}; 
[computeCommandEncoder dispatchThreadgroups:numThreadgroups threadsPerThreadgroup:ts]; 
[ computeCommandEncoder endEncoding ]; 
[ commandBuffer commit]; 

//get data computed by GPU 
NSData* outdata = [NSData dataWithBytesNoCopy:[outVectorBuffer contents ] length: tmp_length freeWhenDone:false ]; 
float final_out[1000]; 
[outdata getBytes:final_out length:tmp_length]; 

// In my opinion, each value of final_out should be 10.0
for(int i = 0; i < 1000; i++) 
{ 
    printf("%.2f : %.2f\n", tmpbuf[i], final_out[i]); 
} 

The shader file, named s.shader, is shown below. It assigns the value 10.0 to every output element:

using namespace metal; 
kernel void sigmoid(const device float *inVector [[ buffer(0) ]], 
       device float *outVector [[ buffer(1) ]], 
       uint id [[ thread_position_in_grid ]]) { 
    // This calculates sigmoid for _one_ position (=id) in a vector per call on the GPU 
    outVector[id] = 10.0; 
} 

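In the code above, I read the data computed by the GPU through the variable final_out. In my opinion, every value of final_out should be 10.0, as written in s.shader. However, all values of final_out are 0. Is there a problem with how I get the data back from the GPU? Thanks.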

Answers

2

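Committing a command buffer only tells the driver to start executing it. If you want to read the results of a GPU operation on the CPU, you need to either block the current thread with -waitUntilCompleted, or add a block to be called when the command buffer completes with the -addCompletedHandler: method.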
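The two options described above can be sketched roughly as follows. This is only an illustration: the helper names `CommitAndWait` and `CommitWithHandler` are hypothetical, and the code assumes the output buffer uses the Shared storage mode so that `contents` is directly readable once the GPU has finished.

```objective-c
#import <Metal/Metal.h>
#include <stdio.h>

// Hypothetical helper: commit, then block the calling thread until the GPU is done.
static BOOL CommitAndWait(id<MTLCommandBuffer> commandBuffer) {
    [commandBuffer commit];
    [commandBuffer waitUntilCompleted];
    if (commandBuffer.status == MTLCommandBufferStatusError) {
        NSLog(@"Command buffer failed: %@", commandBuffer.error);
        return NO;
    }
    return YES; // it is now safe to read the output buffer's contents
}

// Hypothetical helper: non-blocking alternative using a completion handler.
// Assumes outBuffer holds `count` floats in Shared storage mode.
static void CommitWithHandler(id<MTLCommandBuffer> commandBuffer,
                              id<MTLBuffer> outBuffer,
                              NSUInteger count) {
    // The handler has to be registered before the command buffer is committed.
    [commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> cb) {
        if (cb.status == MTLCommandBufferStatusError) {
            NSLog(@"Command buffer failed: %@", cb.error);
            return;
        }
        const float *results = (const float *)outBuffer.contents;
        for (NSUInteger i = 0; i < count; i++) {
            printf("%lu : %.2f\n", (unsigned long)i, results[i]);
        }
    }];
    [commandBuffer commit];
}
```

In the question's code, either helper would take the place of the bare `[ commandBuffer commit]` call, with the read-back moved after the wait or into the handler.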

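One more note: it looks like you are using buffers with the Shared storage mode. If you ever use buffers with the Managed storage mode, you will also need to create a blit command encoder and call synchronizeResource: with the appropriate buffer, then wait for it to complete as described above, so that the results are copied back from the GPU.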
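For the Managed case, a minimal sketch might look like the following. The helper name `EncodeManagedSync` is hypothetical, and it assumes `buffer` was created with `MTLResourceStorageModeManaged` (a macOS storage mode) and that the compute encoder on the same command buffer has already ended encoding.

```objective-c
#import <Metal/Metal.h>

// Hypothetical helper: encodes a blit pass that makes GPU writes to a
// Managed buffer visible to the CPU. Call it after the compute encoder's
// endEncoding and before committing the command buffer.
static void EncodeManagedSync(id<MTLCommandBuffer> commandBuffer,
                              id<MTLBuffer> buffer) {
    id<MTLBlitCommandEncoder> blit = [commandBuffer blitCommandEncoder];
    [blit synchronizeResource:buffer];
    [blit endEncoding];
}

// Usage sketch (assumed surrounding code):
//   [computeCommandEncoder endEncoding];
//   EncodeManagedSync(commandBuffer, outVectorBuffer);
//   [commandBuffer commit];
//   [commandBuffer waitUntilCompleted];
//   // outVectorBuffer.contents can be read from here on
```

With Shared buffers, as in the question, this extra pass is not needed.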

+0

Thanks warrenm, I will improve the code accordingly. – Pony

+0

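The problem still exists. I added "[commandBuffer waitUntilCompleted];" after committing the commandBuffer, and I am not using buffers with the Managed storage mode. However, getting the data back from the GPU still fails. – Pony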

+0

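You are dispatching a 2D grid, but your 'thread_position_in_grid' parameter is 1D. This is an invalid configuration, and it should trip an assertion if you run with the validation layer enabled. – warrenm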
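Putting the comments together, one possible corrected version dispatches a 1D grid that matches the kernel's `uint` thread position and waits for completion before reading. This is a sketch, not the asker's final code: it compiles the kernel from an embedded source string instead of loading the .metallib (an assumption made so the example stands alone), it uses 100 elements throughout, and the variable names are illustrative.

```objective-c
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>
#include <stdio.h>

// Same kernel body as s.shader, embedded so the sketch needs no .metallib file.
static NSString *const kKernelSource =
    @"#include <metal_stdlib>\n"
    @"using namespace metal;\n"
    @"kernel void sigmoid(const device float *inVector [[ buffer(0) ]],\n"
    @"                    device float *outVector [[ buffer(1) ]],\n"
    @"                    uint id [[ thread_position_in_grid ]]) {\n"
    @"    outVector[id] = 10.0;\n"
    @"}\n";

int main(void) {
    @autoreleasepool {
        id<MTLDevice> device = MTLCreateSystemDefaultDevice();
        if (!device) { NSLog(@"No Metal device available"); return 1; }

        NSError *error = nil;
        id<MTLLibrary> library = [device newLibraryWithSource:kKernelSource
                                                      options:nil
                                                        error:&error];
        if (!library) { NSLog(@"Library error: %@", error); return 1; }

        id<MTLFunction> function = [library newFunctionWithName:@"sigmoid"];
        id<MTLComputePipelineState> pipeline =
            [device newComputePipelineStateWithFunction:function error:&error];
        if (!pipeline) { NSLog(@"Pipeline error: %@", error); return 1; }

        // Element count and buffer length now agree (100 floats).
        enum { kCount = 100 };
        float input[kCount];
        for (int i = 0; i < kCount; i++) { input[i] = (float)i; }
        NSUInteger length = kCount * sizeof(float);

        id<MTLBuffer> inBuffer = [device newBufferWithBytes:input
                                                     length:length
                                                    options:MTLResourceStorageModeShared];
        id<MTLBuffer> outBuffer = [device newBufferWithLength:length
                                                      options:MTLResourceStorageModeShared];

        id<MTLCommandQueue> queue = [device newCommandQueue];
        id<MTLCommandBuffer> commandBuffer = [queue commandBuffer];
        id<MTLComputeCommandEncoder> encoder = [commandBuffer computeCommandEncoder];
        [encoder setComputePipelineState:pipeline];
        [encoder setBuffer:inBuffer offset:0 atIndex:0];
        [encoder setBuffer:outBuffer offset:0 atIndex:1];

        // 1D dispatch: 10 threadgroups x 10 threads each = 100 threads.
        MTLSize threadsPerGroup = MTLSizeMake(10, 1, 1);
        MTLSize groupCount = MTLSizeMake(kCount / 10, 1, 1);
        [encoder dispatchThreadgroups:groupCount threadsPerThreadgroup:threadsPerGroup];
        [encoder endEncoding];

        [commandBuffer commit];
        [commandBuffer waitUntilCompleted]; // block until the GPU has finished

        const float *result = (const float *)outBuffer.contents;
        for (int i = 0; i < kCount; i++) {
            printf("%.2f : %.2f\n", input[i], result[i]);
        }
    }
    return 0;
}
```

If the 2D layout from the question is really wanted, the alternative would be to declare the kernel parameter as `uint2` and compute a linear index inside the kernel; the 1D dispatch is simply the smaller change here.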