2012-12-12 95 views
1

我做了一個生命遊戲的順序版本,但現在我需要使用OpenMP創建一個並行版本的代碼,但我遇到了一些問題。 如果有人能幫助我,那將會非常好。 THKS。 這裏是我的順序代碼:OpenMP的生命遊戲

// Swapping the two grids 
#define SWAP_BOARDS(b1, b2) do { \ 
char* temp = b1; \ 
b1 = b2; \ 
b2 = temp; \ 
} while(0) 

// Simplifying access to grid elements 
    #define BOARD(G, X, Y) ((G)[NC*(X)+(Y)]) 

char* sequential_game_of_life (char* outgrid, char* ingrid, 
     const int nrows, const int ncols, const int gens_max) { 

    const int NC = ncols; 
    int curgen, i, j; 

for (curgen = 0; curgen < gens_max; curgen++) 
    { 

    for (i = 0; i < nrows; i++) 
{ 
    for (j = 0; j < ncols; j++) 
    { 
     const int inorth = mod (i-1, nrows); 
     const int isouth = mod (i+1, nrows); 
     const int jwest = mod (j-1, ncols); 
     const int jeast = mod (j+1, ncols); 

     const char neighbor_count = 
    BOARD (ingrid, inorth, jwest) + 
    BOARD (ingrid, inorth, j) + 
    BOARD (ingrid, inorth, jeast) + 
    BOARD (ingrid, i, jwest) + 
    BOARD (ingrid, i, jeast) + 
    BOARD (ingrid, isouth, jwest) + 
    BOARD (ingrid, isouth, j) + 
    BOARD (ingrid, isouth, jeast); 

     BOARD(outgrid, i, j) = alivep (neighbor_count, BOARD (ingrid, i, j)); 
    } 
} 
    SWAP_BOARDS(outgrid, ingrid); 
} 
    return outgrid; 
} 

我知道,我必須爲平行的那些3,但我不能看到如何做到這一點。

+0

那麼問題是什麼?是不是像「我怎樣才能讓3'for'循環與OpenMP並行運行?」或類似的東西? – netcoder

回答

4

我覺得外環不能並行,因爲輸入到每一代前代,所以它有一個順序式(至少你不能用細微的變化做到這一點!)

在的情況下,嵌套循環遍歷矩陣或類似的東西,我喜歡從0ncol*nrow(在你的情況)運行一個循環,並從循環索引中找到ij

這樣的:

// because you are running a parallel codes multiple times in a loop, 
// it would be better to make the thread swarm first and schedule the 
// tasks in each loop iteration, to avoid multiple creation and destruction 
// of working threads 
#pragma omp parallel 
for (curgen = 0; curgen < gens_max; curgen++) 
{ 
    #pragma omp for 
    for (t = 0; t < nrows*ncols; t++) 
    { 
     int i = t/ncols; 
     int j = t % ncols; 
     const int inorth = mod (i-1, nrows); 
     const int isouth = mod (i+1, nrows); 
     const int jwest = mod (j-1, ncols); 
     const int jeast = mod (j+1, ncols); 

     const char neighbor_count = 
      BOARD (ingrid, inorth, jwest) + 
      BOARD (ingrid, inorth, j) + 
      BOARD (ingrid, inorth, jeast) + 
      BOARD (ingrid, i, jwest) + 
      BOARD (ingrid, i, jeast) + 
      BOARD (ingrid, isouth, jwest) + 
      BOARD (ingrid, isouth, j) + 
      BOARD (ingrid, isouth, jeast); 

     BOARD(outgrid, i, j) = alivep (neighbor_count, BOARD (ingrid, i, j)); 
    } 
    SWAP_BOARDS(outgrid, ingrid); 
} 

我跑在我的筆記本電腦上的1000×1000矩陣雙核2.53 GHz的CPU超過1000代這個代碼,並獲得69%的速度了。

+0

非常好,但我在這裏有一些疑問,實際上是兩個: 第一:我不得不定義一個類似#pragma omp parallel num_threads(NTHREADS)private(t)的並行區域嗎? 第二:當我用你的建議運行算法時,使用1個線程比使用4更快。任何猜測爲什麼會發生這種情況? – tsukanomon

+1

'#pragma omp parallel for'是'#pragma omp parallel'的快捷方式,然後是'#pragma omp for'。您也可以在其中指定其他選項,如:'#pragma omp parallel for num_threads(N)'。爲加快速度,請檢查更新的答案 – saeedn

+1

您必須給予'curgen'私人待遇。當有'collapse(n)'OpenMP指令時,手動循環崩潰是多餘的。 –