Converting a color image to greyscale in parallel with CUDA

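I am trying to solve a problem in which I have to convert a color image to a greyscale image. For this I am using CUDA's parallel approach.

The kernel code I invoke on the GPU is as follows.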

__global__ 
void rgba_to_greyscale(const uchar4* const rgbaImage, 
        unsigned char* const greyImage, 
        int numRows, int numCols) 
{ 
    int absolute_image_position_x = blockIdx.x; 
    int absolute_image_position_y = blockIdx.y; 

    if (absolute_image_position_x >= numCols || 
    absolute_image_position_y >= numRows) 
{ 
    return; 
} 
uchar4 rgba = rgbaImage[absolute_image_position_x + absolute_image_position_y]; 
float channelSum = .299f * rgba.x + .587f * rgba.y + .114f * rgba.z; 
greyImage[absolute_image_position_x + absolute_image_position_y] = channelSum; 

} 

void your_rgba_to_greyscale(const uchar4 * const h_rgbaImage, 
          uchar4 * const d_rgbaImage, 
          unsigned char* const d_greyImage, 
          size_t numRows, 
          size_t numCols) 
{ 
    //You must fill in the correct sizes for the blockSize and gridSize 
    //currently only one block with one thread is being launched 
    const dim3 blockSize(numCols/32, numCols/32 , 1); //TODO 
    const dim3 gridSize(numRows/12, numRows/12 , 1); //TODO 
    rgba_to_greyscale<<<gridSize, blockSize>>>(d_rgbaImage, 
              d_greyImage, 
              numRows, 
              numCols); 

    cudaDeviceSynchronize(); checkCudaErrors(cudaGetLastError()); 
}

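I see a dotted line in the first row of pixels.

The error I get is:

libdc1394 error: Failed to initialize libdc1394
Difference at pos 51 exceeds tolerance of 5
Reference: 255
GPU: 0
My input/output images. Can anyone help me with this? Thanks in advance.

Source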

2013-02-05 Ashish Singh

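Please give your question a more meaningful title. As it stands, it is absolutely meaningless to anyone but you. How would someone with a similar image-processing problem find this question by searching? – talonmies

@talonmies: I hope the title makes sense now. –

This is an assignment from Udacity's "Intro to Parallel Programming" course. You should solve it yourself rather than using Stack Overflow to solve your problems. – RoBiK

Since posting this question I have kept working on the problem. A few improvements were needed to solve it correctly, and I now realize my initial solution was wrong. The changes: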

1. absolute_position_x =(blockIdx.x * blockDim.x) + threadIdx.x; 
2. absolute_position_y = (blockIdx.y * blockDim.y) + threadIdx.y;

Second,

1. const dim3 blockSize(24, 24, 1); 
2. const dim3 gridSize((numCols/16), (numRows/16) , 1);

In the solution above, we use a grid of numCols/16 * numRows/16 blocks and a block size of 24 * 24.

The code executed in 0.040576 ms.

@datenwolf: thanks for answering above!

Source

2013-02-06 04:03:33

Any idea why blockSize needs to be '24, 24' and gridSize 'numCols/16, numRows/16'? What is the reason for this? Would other numbers work? – alvas
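One way to look at that question, offered as a sketch and not as the poster's stated reasoning: a launch works whenever the threads along each axis cover every pixel and the kernel discards the surplus with a bounds check. The helper names below (threadsAlongAxis, coversAxis, idleThreads) are hypothetical and written in plain host-side C++ so they compile without nvcc.

```cpp
#include <cassert>

// Threads launched along one axis for a given block count and block edge.
inline int threadsAlongAxis(int blocks, int blockEdge) {
    assert(blocks >= 0 && blockEdge > 0);
    return blocks * blockEdge;
}

// True when every pixel index 0..extent-1 along the axis gets a thread.
inline bool coversAxis(int extent, int blocks, int blockEdge) {
    return threadsAlongAxis(blocks, blockEdge) >= extent;
}

// Threads that land outside the image and must exit via the bounds check.
inline int idleThreads(int extent, int blocks, int blockEdge) {
    int launched = threadsAlongAxis(blocks, blockEdge);
    return launched > extent ? launched - extent : 0;
}
```

For a 640-pixel-wide image, 640/16 = 40 blocks of 24 threads launch 960 threads along that axis, so the image is covered but 320 threads sit idle. For a 31-pixel-wide image the same formula launches only 24 threads and leaves 7 columns unprocessed, which suggests the 24/16 pairing over-provisions for large images and is not a general rule.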

libdc1394 error: Failed to initialize libdc1394

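I don't think this is a CUDA problem. libdc1394 is a library for accessing IEEE1394 (aka FireWire, aka iLink) video devices such as DV camcorders and Apple iSight cameras. The library did not initialize properly, so you are not getting useful results. Basically it is NINO: Nonsense In, Nonsense Out.

Source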

2013-02-05 16:07:41 datenwolf

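@datenwolf Please see, I have added a link to the input/output images I am getting. –

What I see is that the error at pos 51 exceeds the tolerance of 5, so I am guessing it has to do with the color mode and not with any linker-type error. –

@ashish173: This is not a linker problem, it is a runtime problem. The dc1394 library fails to initialize properly at program startup and will probably produce only garbage when used to retrieve pictures. You have to fix the initialization problem first (that is a runtime matter, i.e. something you have to code). – datenwolf

You are launching with the following block and grid dimensions: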

const dim3 blockSize(numCols/32, numCols/32 , 1); //TODO 
    const dim3 gridSize(numRows/12, numRows/12 , 1); //TODO

But you are not using the threads in your kernel code!

int absolute_image_position_x = blockIdx.x; 
int absolute_image_position_y = blockIdx.y;

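Think of it this way: the width of the image can be divided into absolute_image_position_x column parts, and the height of the image can be divided into absolute_image_position_y row parts. Within each box created by that division, all the pixels now need to be converted and written to greyImage in parallel. That is enough of a spoiler for an assignment :)

Source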

2013-02-06 00:20:54 sadaf2605

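In answer: I wasn't using any threads at all, which was silly of me. –

The way you compute the absolute x & y image positions is fine. But when you need to access a specific pixel in the color image, shouldn't you use the following code?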

uchar4 rgba = rgbaImage[absolute_image_position_x + (absolute_image_position_y * numCols)];

I think so, by comparison with the code you would write to solve the same problem serially. Please let me know :)
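As a small illustration of that point, here is a host-side C++ sketch of the two index formulas. The function names flatIndex and collidingIndex are made up for this example; only the formulas come from the thread.

```cpp
#include <cassert>
#include <cstddef>

// Row-major flattening: rows are stored one after another, each numCols wide.
inline std::size_t flatIndex(std::size_t x, std::size_t y, std::size_t numCols) {
    assert(x < numCols);
    return y * numCols + x;
}

// The index without the row stride, as in the question's kernel.
inline std::size_t collidingIndex(std::size_t x, std::size_t y) {
    return x + y;
}
```

With x + y, pixels (1, 0) and (0, 1) both map to index 1, so many threads overwrite the same few entries and most of the image is never written. With the row stride, each (x, y) pair gets its own slot.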

Source

2013-05-30 04:58:09 roynalnaruto

You should still have a runtime problem: the conversion will not give the correct result.

The lines:

uchar4 rgba = rgbaImage[absolute_image_position_x + absolute_image_position_y];
greyImage[absolute_image_position_x + absolute_image_position_y] = channelSum;

should be changed to:

uchar4 rgba = rgbaImage[absolute_image_position_x + absolute_image_position_y * numCols];
greyImage[absolute_image_position_x + absolute_image_position_y * numCols] = channelSum;

Source

2013-10-14 06:50:04 Alex

__global__ 
void rgba_to_greyscale(const uchar4* const rgbaImage, 
         unsigned char* const greyImage, 
         int numRows, int numCols) 
{ 
    int rgba_x = blockIdx.x * blockDim.x + threadIdx.x; 
    int rgba_y = blockIdx.y * blockDim.y + threadIdx.y; 
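    // The grid is rounded up, so skip threads that fall outside the image.
    if (rgba_x >= numCols || rgba_y >= numRows) return;
    int pixel_pos = rgba_x + rgba_y * numCols;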

    uchar4 rgba = rgbaImage[pixel_pos]; 
    unsigned char gray = (unsigned char)(0.299f * rgba.x + 0.587f * rgba.y + 0.114f * rgba.z); 
    greyImage[pixel_pos] = gray; 
} 

void your_rgba_to_greyscale(const uchar4 * const h_rgbaImage, uchar4 * const d_rgbaImage, 
          unsigned char* const d_greyImage, size_t numRows, size_t numCols) 
{ 
    //You must fill in the correct sizes for the blockSize and gridSize 
    //currently only one block with one thread is being launched 
    const dim3 blockSize(24, 24, 1); //TODO 
    const dim3 gridSize(numCols/24+1, numRows/24+1, 1); //TODO 
    rgba_to_greyscale<<<gridSize, blockSize>>>(d_rgbaImage, d_greyImage, numRows, numCols); 

    cudaDeviceSynchronize(); checkCudaErrors(cudaGetLastError()); 
}

Source

2014-03-10 07:15:14 bzhan

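Although you may be getting the right answer, you are doing it in a rather odd way. You are passing rows where columns need to be passed for your grid size, and your pixel_pos formula does not follow the standard way of flattening a 2D array into a 1D array... it should be numRows * y + x or numCols * x + y, but it all works b/c your grid is set up as cols, rows and not rows, cols – labheshr

The same code, with handling for images of non-standard input size: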

__global__
void rgba_to_greyscale(const uchar4* const rgbaImage,
                       unsigned char* const greyImage,
                       int numRows, int numCols)
{
    // In this version x indexes rows and y indexes columns,
    // matching the (rows, cols) grid set up below.
    int idx = blockDim.x * blockIdx.x + threadIdx.x;
    int idy = blockDim.y * blockIdx.y + threadIdx.y;

    // The grid is rounded up, so threads past the image edge must do nothing.
    if (idx >= numRows || idy >= numCols)
        return;

    uchar4 rgbcell = rgbaImage[idx * numCols + idy];
    greyImage[idx * numCols + idy] = 0.299 * rgbcell.x + 0.587 * rgbcell.y + 0.114 * rgbcell.z;
}

void your_rgba_to_greyscale(const uchar4 * const h_rgbaImage, uchar4 * const d_rgbaImage,
                            unsigned char* const d_greyImage, size_t numRows, size_t numCols)
{
    // Pick the largest candidate block edge that divides the pixel count evenly.
    int totalpixels = numRows * numCols;
    int factors[] = {2, 4, 8, 16, 24, 32};
    std::vector<int> numbers(factors, factors + sizeof(factors) / sizeof(int)); // needs #include <vector>
    int factor = 1;

    while (!numbers.empty())
    {
        if (totalpixels % numbers.back() == 0)
        {
            factor = numbers.back();
            break;
        }
        else
        {
            numbers.pop_back();
        }
    }

    const dim3 blockSize(factor, factor, 1); //TODO
    const dim3 gridSize(numRows / factor + 1, numCols / factor + 1, 1); //TODO
    rgba_to_greyscale<<<gridSize, blockSize>>>(d_rgbaImage, d_greyImage, numRows, numCols);

    cudaDeviceSynchronize(); checkCudaErrors(cudaGetLastError());
}

Source

2015-06-25 22:39:59

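The libdc1394 error has nothing to do with FireWire capability in this case: it comes from the library Udacity uses to compare the image your program creates with the reference image. What it is telling you is that the difference between your image and the reference image exceeds a specific threshold at that position, i.e. at that pixel.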
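The course's actual comparison code is not shown in this thread, so the following is only a sketch of what a per-pixel tolerance check of that kind might look like. The function name firstMismatch and its signature are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Returns the first position where the two images differ by more than
// `tolerance`, or -1 when they agree everywhere. Hypothetical helper.
inline long firstMismatch(const std::vector<unsigned char>& reference,
                          const std::vector<unsigned char>& gpu,
                          int tolerance) {
    assert(reference.size() == gpu.size());
    for (std::size_t i = 0; i < reference.size(); ++i) {
        int diff = std::abs(int(reference[i]) - int(gpu[i]));
        if (diff > tolerance)
            return static_cast<long>(i);
    }
    return -1;
}
```

Under this reading, the reported "Reference: 255, GPU: 0" at pos 51 would mean the kernel never wrote a value for that pixel, which is consistent with the indexing problems discussed in the other answers.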

Source

2015-07-12 05:17:10 metamorphosis

I recently joined this course and tried your solution, but it did not work, so I tried my own. You were almost correct. The correct solution is:

__global__
void rgba_to_greyscale(const uchar4* const rgbaImage, 
       unsigned char* const greyImage, 
       int numRows, int numCols) 
{

int pos_x = (blockIdx.x * blockDim.x) + threadIdx.x; 
int pos_y = (blockIdx.y * blockDim.y) + threadIdx.y; 
if(pos_x >= numCols || pos_y >= numRows) 
    return; 

uchar4 rgba = rgbaImage[pos_x + pos_y * numCols]; 
greyImage[pos_x + pos_y * numCols] = (.299f * rgba.x + .587f * rgba.y + .114f * rgba.z); 

}

The rest is the same as your code.
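To check a kernel like this one, it can help to have a serial version to compare against. The sketch below is one possible CPU reference in plain C++. The Rgba struct is a stand-in for CUDA's uchar4, and greyscaleReference is a hypothetical name; the weights and the index formula are the ones from the kernel above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for CUDA's uchar4 so the sketch compiles without the CUDA headers.
struct Rgba {
    unsigned char x, y, z, w;  // red, green, blue, alpha
};

// Serial reference: same weights and same row-major index as the kernel.
inline std::vector<unsigned char> greyscaleReference(const std::vector<Rgba>& image,
                                                     std::size_t numRows,
                                                     std::size_t numCols) {
    assert(image.size() == numRows * numCols);
    std::vector<unsigned char> grey(image.size());
    for (std::size_t y = 0; y < numRows; ++y) {
        for (std::size_t x = 0; x < numCols; ++x) {
            const Rgba& p = image[x + y * numCols];
            float channelSum = .299f * p.x + .587f * p.y + .114f * p.z;
            grey[x + y * numCols] = static_cast<unsigned char>(channelSum);
        }
    }
    return grey;
}
```

The two nested loops play the role of the thread grid: each (x, y) pair that a GPU thread would handle is visited once, and the loop bounds do the job of the kernel's bounds check.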

Source

2015-10-17 14:46:32

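Can you explain the formula: pos_x + pos_y * numCols? – labheshr

Never mind: this answered my question https://stackoverflow.com/questions/2151084/map-a-2d-array-onto-a-1d-array-c – labheshr

Since you do not know the image size in advance, it is best to choose any reasonable size for the two-dimensional thread block and then check two conditions. First, the pos_x and pos_y indices in the kernel must stay within numCols and numRows. Second, the grid should be just large enough that the threads in all blocks together cover every pixel.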

const dim3 blockSize(16, 16, 1); 
const dim3 gridSize((numCols%16) ? numCols/16+1 : numCols/16, 
(numRows%16) ? numRows/16+1 : numRows/16, 1);
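The ternary expressions above are a ceiling division. As a sketch in host-side C++, with hypothetical helper names, the same block count can be written in one expression and compared with the "always add one" form used elsewhere in this thread:

```cpp
#include <cassert>

// Blocks needed along one axis, written as in the answer above.
inline int blocksTernary(int extent, int blockEdge) {
    return (extent % blockEdge) ? extent / blockEdge + 1 : extent / blockEdge;
}

// Equivalent ceiling division in a single expression.
inline int blocksCeil(int extent, int blockEdge) {
    assert(extent >= 0 && blockEdge > 0);
    return (extent + blockEdge - 1) / blockEdge;
}

// The "always add one" variant, which launches a spare block
// whenever extent is an exact multiple of blockEdge.
inline int blocksPlusOne(int extent, int blockEdge) {
    return extent / blockEdge + 1;
}
```

For a width of 640 and a block edge of 16, the first two give 40 blocks and the third gives 41. For a width of 641 all three give 41. The spare block is harmless as long as the kernel has a bounds check.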

Source

2016-09-06 19:26:55 MuneshSingh

1- int x =(blockIdx.x * blockDim.x) + threadIdx.x;

2- int y = (blockIdx.y * blockDim.y) + threadIdx.y;

And for the grid and block sizes:

1- const dim3 blockSize(32, 32, 1);

2- const dim3 gridSize((numCols/32+1), (numRows/32+1) , 1);

The code executed in 0.036992 ms.

Source

2017-02-09 15:34:04

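const dim3 blockSize(16, 16, 1); //TODO
// x spans columns and y spans rows, so the grid is (columns, rows).
const dim3 gridSize((numCols+15)/16, (numRows+15)/16, 1); //TODO

int x = blockIdx.x * blockDim.x + threadIdx.x;
int y = blockIdx.y * blockDim.y + threadIdx.y;

// The rounded-up grid launches extra threads; they must not touch memory.
if (x >= numCols || y >= numRows) return;

uchar4 rgba = rgbaImage[y*numCols + x];
float channelSum = .299f * rgba.x + .587f * rgba.y + .114f * rgba.z;
greyImage[y*numCols + x] = channelSum;

Source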

2017-07-19 03:30:43
