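Manual deep copy to device in C
I am trying to parallelize a program that does some image processing with OpenACC. As part of that processing I have defined a custom struct similar to: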
typedef struct {
RGB *image;
double property;
} Deep;
which I access through my array Deep *structPointer.
I came across some documentation on manually copying the entire contents of structPointer to the GPU, which left me with the following code.
Deep *structPointer = (Deep*)
malloc(total_size*sizeof(Deep));
assert(structPointer);
int i;
for (i = 0; i < total_size; i++)
{
structPointer[i].image = randomImage(width, height, max);
}
Deep *dP = (Deep*) acc_copyin(structPointer, sizeof(Deep)*total_size); // device copy of the struct array
for (i=0; i < total_size; i++) {
RGB *dA = (RGB*) acc_copyin(structPointer[i].image, sizeof(RGB)*width*height); // device address in dA
acc_memcpy_to_device(&dP[i].image, &dA, sizeof(RGB*)); // point the device-side struct at the device image
}
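This all works fine until I try to run, in parallel, a for loop that accesses structPointer and sets the property member of each array element based on the contents of its RGB *image.
Pseudocode: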
#pragma acc parallel loop copyin(inputImage[0:width*height], width, height)
for (i = 0; i < total_size; i++) {
computeProperty(inputImage, structPointer+i, width, height);
}
inline void computeProperty(const RGB *A, Deep *B, int width, int height)
{
B->property = 10;
}
I get:
call to cuStreamSynchronize returned error 700: Illegal address during kernel execution
The cuda-memcheck output is:
> ========= CUDA-MEMCHECK image2.ppm is a PPM file 256 x 256 image, max value= 255
> ========= Program hit CUDA_ERROR_INVALID_CONTEXT (error 201) due to "invalid device context" on CUDA API call to cuCtxAttach.
> ========= Saved host backtrace up to driver entry point at error
> ========= Host Frame:/usr/lib64/libcuda.so (cuCtxAttach + 0x156) [0x13fc36]
> ========= Host Frame:./genimg_acc [0x13639]
> =========
> ========= Program hit CUDA_ERROR_ILLEGAL_ADDRESS (error 700) due to "an illegal memory access was encountered" on CUDA API call to
> cuStreamSynchronize. call to cuStreamSynchronize returned error 700:
> Illegal address during kernel execution
> ========= Saved host backtrace up to driver entry point at error
> ========= Host Frame:/usr/lib64/libcuda.so (cuStreamSynchronize + 0x13d) [0x149a9d]
> ========= Host Frame:./genimg_acc [0x15856]
> =========
> ========= Program hit CUDA_ERROR_ILLEGAL_ADDRESS (error 700) due to "an illegal memory access was encountered" on CUDA API call to
> cuCtxSynchronize.
> ========= Saved host backtrace up to driver entry point at error
> ========= Host Frame:/usr/lib64/libcuda.so (cuCtxSynchronize + 0x127) [0x13ee37]
Note that the program runs correctly when compiled without OpenACC and run on a single thread.
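For reference, the pattern described above (copy the struct array, attach each device image pointer, run the loop, fetch the results) can be condensed into a self-contained sketch. Several things here are assumptions and not taken from the real program: the RGB layout, the helper names mean_red and compute_properties, and the use of a present clause on the loop. It is not confirmed that this form avoids the error 700 shown above. The acc_* calls are guarded by _OPENACC so the same file also builds as plain serial C.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Assumed pixel layout; the real RGB type is not shown in the question. */
typedef struct { unsigned char r, g, b; } RGB;

typedef struct {
    RGB *image;
    double property;
} Deep;

/* Hypothetical stand-in for computeProperty: mean red value of one image. */
#pragma acc routine seq
static double mean_red(const RGB *img, int n)
{
    double sum = 0.0;
    for (int k = 0; k < n; k++)
        sum += img[k].r;
    return sum / n;
}

/* Deep-copy s[0:count] to the device, fill in property, copy results back. */
void compute_properties(Deep *s, int count, int width, int height)
{
    int n = width * height;
    assert(s != NULL && count > 0 && n > 0);

#ifdef _OPENACC
    /* Shallow copy of the struct array, then patch each device-side pointer. */
    Deep *dP = (Deep *)acc_copyin(s, sizeof(Deep) * count);
    for (int i = 0; i < count; i++) {
        RGB *dA = (RGB *)acc_copyin(s[i].image, sizeof(RGB) * n);
        acc_memcpy_to_device(&dP[i].image, &dA, sizeof(RGB *));
    }
#endif

    /* present() states that s is already on the device, so no implicit copy
       is made that could overwrite the patched pointers. */
    #pragma acc parallel loop present(s[0:count])
    for (int i = 0; i < count; i++)
        s[i].property = mean_red(s[i].image, n);

#ifdef _OPENACC
    /* Fetch only the property field so the host image pointers are not
       replaced by device addresses, then release the device copies. */
    for (int i = 0; i < count; i++) {
        acc_update_self(&s[i].property, sizeof(double));
        acc_delete(s[i].image, sizeof(RGB) * n);
    }
    acc_delete(s, sizeof(Deep) * count);
#endif
}
```

Built without an OpenACC flag, the pragmas are ignored and the loop runs serially on the host, which gives a baseline to compare against a build made with something like gcc -fopenacc or nvc -acc.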
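Do you really want to use '&inputImage[j]' in the assignment to 'f'? Don't you want the value of 'inputImage[j]' rather than its address? –
This is just pseudocode rather than a paste of the whole program. I can expand on it if the full program would help, but basically this is what it is doing. – challett
You could try pasting it. Your program may be corrupting something. –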