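This is a follow-up question to MPI_Gather 2D array. The goal is to use MPI_Gather() to combine the central elements of each process into a global matrix. Here is the situation: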
id = 0 has this submatrix
|16.000000| |11.000000| |12.000000| |15.000000|
|6.000000| |1.000000| |2.000000| |5.000000|
|8.000000| |3.000000| |4.000000| |7.000000|
|14.000000| |9.000000| |10.000000| |13.000000|
-----------------------
id = 1 has this submatrix
|12.000000| |15.000000| |16.000000| |11.000000|
|2.000000| |5.000000| |6.000000| |1.000000|
|4.000000| |7.000000| |8.000000| |3.000000|
|10.000000| |13.000000| |14.000000| |9.000000|
-----------------------
id = 2 has this submatrix
|8.000000| |3.000000| |4.000000| |7.000000|
|14.000000| |9.000000| |10.000000| |13.000000|
|16.000000| |11.000000| |12.000000| |15.000000|
|6.000000| |1.000000| |2.000000| |5.000000|
-----------------------
id = 3 has this submatrix
|4.000000| |7.000000| |8.000000| |3.000000|
|10.000000| |13.000000| |14.000000| |9.000000|
|12.000000| |15.000000| |16.000000| |11.000000|
|2.000000| |5.000000| |6.000000| |1.000000|
-----------------------
The global matrix:
|1.000000| |2.000000| |5.000000| |6.000000|
|3.000000| |4.000000| |7.000000| |8.000000|
|11.000000| |12.000000| |15.000000| |16.000000|
|-3.000000| |-3.000000| |-3.000000| |-3.000000|
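What I am trying to do is gather only the central elements (the ones not on the boundary) into the global grid, so the global grid should look like this: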
|1.000000| |2.000000| |5.000000| |6.000000|
|3.000000| |4.000000| |7.000000| |8.000000|
|9.000000| |10.000000| |13.000000| |14.000000|
|11.000000| |12.000000| |15.000000| |16.000000|
and not like the one I am getting. Here is my code:
float **gridPtr;
float **global_grid;
lengthSubN = N/pSqrt; // N is the dim of the global grid and pSqrt the sqrt of the number of processes
MPI_Type_contiguous(lengthSubN, MPI_FLOAT, &rowType);
MPI_Type_commit(&rowType);
if(id == 0) {
    MPI_Gather(&gridPtr[1][1], 1, rowType, global_grid[0], 1, rowType, 0, MPI_COMM_WORLD);
    MPI_Gather(&gridPtr[2][1], 1, rowType, global_grid[1], 1, rowType, 0, MPI_COMM_WORLD);
} else {
    MPI_Gather(&gridPtr[1][1], 1, rowType, NULL, 0, rowType, 0, MPI_COMM_WORLD);
    MPI_Gather(&gridPtr[2][1], 1, rowType, NULL, 0, rowType, 0, MPI_COMM_WORLD);
}
...
float** allocate2D(float** A, const int N, const int M) {
    int i;
    float *t0;
    A = malloc(M * sizeof (float*)); /* Allocating pointers */
    if(A == NULL) {
        printf("MALLOC FAILED in A\n");
        return NULL;
    }
    t0 = malloc(N * M * sizeof (float)); /* Allocating data */
    if(t0 == NULL) {
        printf("MALLOC FAILED in t0\n");
        free(A);
        return NULL;
    }
    for (i = 0; i < M; i++)
        A[i] = t0 + i * (N); /* Row i points into the single contiguous data block */
    return A;
}
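To make the wrong output easier to follow, here is a plain-C simulation (no MPI involved) of what the two gathers above write on the root. It is only a sketch: `fake_gather_row` and `simulate_two_gathers` are hypothetical names, the interior values are copied from the submatrices printed above, and it assumes `global_grid` is one contiguous block pre-filled with -3, as `allocate2D` and the output suggest.

```c
#include <assert.h>

#define N 4    /* global grid dimension, as in the output above */
#define P 4    /* number of processes */
#define SUB 2  /* lengthSubN = N / sqrt(P) */

/* Interior (non-boundary) elements of each rank, from the submatrices above. */
static const float interior[P][SUB][SUB] = {
    {{1, 2}, {3, 4}},
    {{5, 6}, {7, 8}},
    {{9, 10}, {11, 12}},
    {{13, 14}, {15, 16}},
};

/* Mimic the root side of MPI_Gather for one interior row per rank:
 * rank r's row lands at recvbuf + r * SUB, back to back. */
static void fake_gather_row(float *recvbuf, int row) {
    assert(row >= 0 && row < SUB);
    for (int r = 0; r < P; ++r)
        for (int j = 0; j < SUB; ++j)
            recvbuf[r * SUB + j] = interior[r][row][j];
}

/* Replay both gathers on a flat N*N buffer pre-filled with -3. */
static void simulate_two_gathers(float *global) {
    for (int k = 0; k < N * N; ++k)
        global[k] = -3.0f;
    fake_gather_row(&global[0 * N], 0); /* stands in for global_grid[0] */
    fake_gather_row(&global[1 * N], 1); /* stands in for global_grid[1] */
}
```

Each gather delivers P * SUB = 8 floats, which is two full rows of the global grid. The second gather starts at row 1, so it overwrites the second half of the first one and nothing ever reaches the last row.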
EDIT:
Here is my attempt without MPI_Gather(), using a subarray instead:
MPI_Datatype mysubarray;
int starts[2] = {1, 1};
int subsizes[2] = {lengthSubN, lengthSubN};
int bigsizes[2] = {N_glob, M_glob};
MPI_Type_create_subarray(2, bigsizes, subsizes, starts,
                         MPI_ORDER_C, MPI_FLOAT, &mysubarray);
MPI_Type_commit(&mysubarray);
MPI_Isend(&(gridPtr[0][0]), 1, mysubarray, 0, 3, MPI_COMM_WORLD, &req[0]);
MPI_Type_free(&mysubarray);
MPI_Barrier(MPI_COMM_WORLD);
if(id == 0) {
    for(i = 0; i < p; ++i) {
        MPI_Irecv(&(global_grid[i][0]), lengthSubN * lengthSubN, MPI_FLOAT, i, 3, MPI_COMM_WORLD, &req[0]);
    }
}
if(id == 0)
    print(global_grid, N_glob, N_glob);
But the result is:
|1.000000| |2.000000| |3.000000| |4.000000|
|5.000000| |6.000000| |7.000000| |8.000000|
|9.000000| |10.000000| |11.000000| |12.000000|
|13.000000| |14.000000| |15.000000| |16.000000|
This is not exactly what I want. I have to find a way to tell the receive that it should place the data differently. So, if I do:
MPI_Irecv(&(global_grid[0][0]), 1, mysubarray, 0, 3, MPI_COMM_WORLD, &req[0]);
then I get:
|-3.000000| |-3.000000| |-3.000000| |-3.000000|
|-3.000000| |1.000000| |2.000000| |-3.000000|
|-3.000000| |3.000000| |4.000000| |-3.000000|
|-3.000000| |-3.000000| |-3.000000| |-3.000000|
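The block lands at row 1, column 1 because the receive reuses the send-side type, whose starts are {1, 1}. A receive-side subarray would need different starts for each source rank. The helper below is a hypothetical sketch of that computation, assuming the ranks are laid out row-major on a pSqrt x pSqrt process grid (which is what the expected global grid suggests); `block_starts` is not part of my code or of MPI.

```c
#include <assert.h>

/* Hypothetical helper: fill starts[2] with the (row, column) offset of
 * rank's block inside the global grid.
 * Assumption: ranks are numbered row-major on a pSqrt x pSqrt process grid,
 * and every block is sub x sub elements. */
static void block_starts(int rank, int pSqrt, int sub, int starts[2]) {
    assert(pSqrt > 0 && rank >= 0 && rank < pSqrt * pSqrt);
    starts[0] = (rank / pSqrt) * sub; /* first global row of the block */
    starts[1] = (rank % pSqrt) * sub; /* first global column of the block */
}
```

With N = 4 and pSqrt = 2 this gives {0, 0}, {0, 2}, {2, 0} and {2, 2} for ranks 0 to 3. These values could then be passed as the starts argument of MPI_Type_create_subarray when building one receive type per source rank, with the receive buffer pointing at &global_grid[0][0].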
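This would be a trivial exercise if you didn't insist on using arrays of pointers to represent 2D arrays. Why not use pitched linear memory instead? – talonmies
@talonmies I am extending code that was written back in 2012, so it's not really my choice. However, I could flatten the 2D array before the gather operation, which wouldn't cost as much as I imagined. So if you have a suggestion for that approach, feel free to post an answer. – gsamaras
@talonmies For example, I could have every process create a 1D array containing only its central elements, e.g. the 4th process would have '{13,14,15,16}'. However, it is still unclear to me how to proceed. – gsamaras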
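One possible way to continue from the flattened idea, sketched here under assumptions rather than offered as the accepted solution: every process sends its sub * sub central elements with a single MPI_Gather of MPI_FLOAT, and the root then copies the rank-ordered buffer into the global grid. `blocks_to_grid` is a hypothetical helper; it assumes ranks are laid out row-major on a pSqrt x pSqrt process grid and that the global grid is stored contiguously in row-major order.

```c
#include <assert.h>
#include <stddef.h>

/* Copy a rank-ordered buffer (rank 0's sub*sub central values, then rank 1's,
 * and so on, which is the order MPI_Gather delivers them in) into an n x n
 * row-major grid.
 * Assumption: ranks sit row-major on a pSqrt x pSqrt process grid. */
static void blocks_to_grid(const float *gathered, float *grid, int n, int pSqrt) {
    assert(pSqrt > 0 && n % pSqrt == 0);
    int sub = n / pSqrt;
    for (int r = 0; r < pSqrt * pSqrt; ++r) {
        int row0 = (r / pSqrt) * sub; /* first global row of rank r's block */
        int col0 = (r % pSqrt) * sub; /* first global column of rank r's block */
        const float *block = gathered + (size_t)r * sub * sub;
        for (int i = 0; i < sub; ++i)
            for (int j = 0; j < sub; ++j)
                grid[(row0 + i) * n + (col0 + j)] = block[i * sub + j];
    }
}
```

For the 4 x 4 example, a gathered buffer holding 1 through 16 in rank order is rearranged into the rows 1 2 5 6, 3 4 7 8, 9 10 13 14 and 11 12 15 16. The extra copy on the root costs one pass over N * N floats; a derived datatype on the receive side could avoid it, at the price of more setup.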