2013-08-07

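Sleeping/waiting in a CUDA thread

I am trying to write some CUDA code to compute the longest common subsequence. I can't work out how to make a thread sleep until the dependencies needed to compute its cell have been satisfied: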

// Ignore the spurious maths here, very messy data structures. Planning ahead to strings that are bigger than GPU blocks. i & j are correct though.
int real_i = blockDim.x * blockIdx.x + threadIdx.x; 
int real_j = blockDim.y * (max_offset - blockIdx.x) + threadIdx.y; 

char i_char = seq1[real_i]; 
char j_char = seq2[real_j]; 

// For i & j = 1 to length 
if ((real_i > 0 && real_j > 0) && (real_i < sequence_length && real_j < sequence_length)) {

    printf("i: %d, j: %d\n", real_i, real_j); 
    printf("I need to wait for dependancy at i: %d j: %d and i: %d j: %d\n", real_i, (real_j - 1), real_i - 1, real_j); 
    printf("Is this true? %d\n", (depend[sequence_length * real_i + (real_j - 1)] && depend[sequence_length * (real_i - 1) + real_j])); 

    //WAIT FOR DEPENDENCY TO BE SATISFIED 
    //THIS IS WHERE I NEED THE CODE TO HANG 
    while((depend[sequence_length * real_i + (real_j - 1)] == false) && (depend[sequence_length * (real_i - 1) + real_j] == false)) { 
    } 

    if (i_char == j_char) 
     c[sequence_length * real_i + real_j] = (c[sequence_length * (real_i - 1) + (real_j - 1)]) + 1; 
    else 
     c[sequence_length * real_i + real_j] = max(c[sequence_length * real_i + (real_j - 1)], c[sequence_length * (real_i - 1) + real_j]); 

    // SETTING THESE TO TRUE SHOULD ALLOW OTHER THREADS TO BREAK PAST THE WHILE BLOCK 
    depend[sequence_length * real_i + (real_j - 1)] = true; 
    depend[sequence_length * (real_i - 1) + real_j] = true; 
} 

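So basically a thread should hang in the while loop until its dependencies have been satisfied by other threads, and only then move on to the computation code.

I know that the "first" thread has its dependencies satisfied, because it prints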

real i 1, real j 1 
I need to wait for dependancy at i: 1 j: 0 and i: 0 j: 1 
Is this true? 1 

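Once it has finished its computation, it sets some cells of the dependency matrix to true, which should let 2 more threads get past the while loop, and the kernel should proceed from there.

However, with the while loop enabled my whole system hangs for ~10 seconds and I get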

the launch timed out and was terminated 

Any suggestions?

Answers

Sleeping is a bad idea; it is better to wait on a condition variable or a mutex.

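On the GPU, conditional statements are expensive: when threads of the same warp take different branches, the branches are executed one after another. So if you can, try to parallelize all of the code. To make sure that all threads have finished a piece of code you can use __syncthreads(), which synchronizes the threads within a single block.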
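The suggestion above can be sketched for the LCS table as follows. This is a minimal sketch under stated assumptions, not the asker's code: it processes the whole table with a single block, because __syncthreads() only synchronizes threads within one block. The names `lcs_wavefront` and `lcs_length_gpu` and the (n+1) x (m+1) table layout are hypothetical choices made for illustration. Instead of spinning on a `depend` array, the threads walk the table one anti-diagonal at a time; cells with the same i + j do not depend on each other, so a barrier between diagonals satisfies every dependency.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <cstring>

// Hypothetical kernel: fills the (n+1) x (m+1) LCS table c, one anti-diagonal
// (i + j == d) at a time. Must be launched with exactly one block.
__global__ void lcs_wavefront(const char *a, int n, const char *b, int m, int *c)
{
    const int w = m + 1;  // row width of the table
    // Row 0 and column 0 stay zero; the host clears the table beforehand.
    for (int d = 2; d <= n + m; ++d) {
        int i_lo = max(1, d - m);  // first row that has a cell on this diagonal
        int i_hi = min(n, d - 1);  // last row that has a cell on this diagonal
        // The block's threads stride over the cells of the diagonal.
        for (int i = i_lo + threadIdx.x; i <= i_hi; i += blockDim.x) {
            int j = d - i;
            if (a[i - 1] == b[j - 1])
                c[i * w + j] = c[(i - 1) * w + (j - 1)] + 1;
            else
                c[i * w + j] = max(c[i * w + (j - 1)], c[(i - 1) * w + j]);
        }
        // The loop bounds on d are the same for every thread, so all threads
        // reach this barrier. It makes diagonal d visible before d + 1 starts.
        __syncthreads();
    }
}

// Hypothetical host wrapper: returns the LCS length of two C strings.
// Error handling is reduced to a single assert to keep the sketch short.
int lcs_length_gpu(const char *a, const char *b)
{
    int n = (int)strlen(a);
    int m = (int)strlen(b);
    size_t cells = (size_t)(n + 1) * (size_t)(m + 1);

    char *d_a = nullptr, *d_b = nullptr;
    int *d_c = nullptr;
    cudaMalloc(&d_a, n + 1);  // +1 avoids a zero-byte allocation
    cudaMalloc(&d_b, m + 1);
    cudaMalloc(&d_c, cells * sizeof(int));
    cudaMemcpy(d_a, a, n, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b, m, cudaMemcpyHostToDevice);
    cudaMemset(d_c, 0, cells * sizeof(int));

    lcs_wavefront<<<1, 256>>>(d_a, n, d_b, m, d_c);  // one block only

    int result = -1;
    // This blocking copy also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(&result, d_c + (size_t)n * (m + 1) + m,
                                 sizeof(int), cudaMemcpyDeviceToHost);
    assert(err == cudaSuccess);

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    return result;
}
```

For example, lcs_length_gpu("ABCBDAB", "BDCABA") should return 4. For strings too large for one block, one option is to launch one kernel per anti-diagonal from the host, since kernels launched into the same stream run in order.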

If you still want the simplest approach, add a mutex, but that is usually a bad idea.
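For reference, a device-side mutex is typically hand-built from atomics. The sketch below is one common spin-lock pattern, not something taken from the question: the names `lock`, `unlock`, `count_blocks` and `count_with_mutex` are hypothetical, and it assumes that only one thread per block ever takes the lock. Letting several threads of the same warp spin on one lock can hang on some GPUs, which is part of why this approach is usually a bad idea.

```cuda
#include <cuda_runtime.h>
#include <cassert>

// Hypothetical helpers: a global-memory spin lock built on atomicCAS.
__device__ void lock(int *mutex)
{
    while (atomicCAS(mutex, 0, 1) != 0) {
        // spin until the lock value changes from 0 to 1 for this thread
    }
    __threadfence();  // order the critical section after the acquire
}

__device__ void unlock(int *mutex)
{
    __threadfence();  // make critical-section writes visible before release
    atomicExch(mutex, 0);
}

// Each block increments the counter once, inside the critical section.
// volatile keeps the counter access from being served from a stale cache.
__global__ void count_blocks(int *mutex, volatile int *counter)
{
    // Assumption of this sketch: only thread 0 of each block takes the lock.
    if (threadIdx.x == 0) {
        lock(mutex);
        *counter = *counter + 1;  // critical section
        unlock(mutex);
    }
}

// Hypothetical host wrapper: returns the final counter value.
int count_with_mutex(int blocks)
{
    int *d_mutex = nullptr, *d_counter = nullptr;
    cudaMalloc(&d_mutex, sizeof(int));
    cudaMalloc(&d_counter, sizeof(int));
    cudaMemset(d_mutex, 0, sizeof(int));    // 0 means unlocked
    cudaMemset(d_counter, 0, sizeof(int));

    count_blocks<<<blocks, 32>>>(d_mutex, d_counter);

    int result = -1;
    // This blocking copy also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(&result, d_counter, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    assert(err == cudaSuccess);

    cudaFree(d_mutex);
    cudaFree(d_counter);
    return result;
}
```

Calling count_with_mutex(64) should return 64. For a simple counter like this, a single atomicAdd would do the same job without a lock; the lock is shown only to illustrate the pattern.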