I actually figured out how to do this. This question could probably be deleted, but since I'm new to MPI, I'll post the solutions here, and I'd be glad if anyone who has suggestions for improvement would share them. Method 1:
// Fox's algorithm
double *b_buffers[2];
b_buffers[0] = (double *) malloc(n_local*n_local*sizeof(double));
b_buffers[1] = b;
for (stage = 0; stage < q; stage++) {
    // copy a into a_temp and broadcast a_temp of each process to all
    // other processes in its row
    for (i = 0; i < n_local*n_local; i++)
        a_temp[i] = a[i];
    MPI_Bcast(a_temp, n_local*n_local, MPI_DOUBLE, (rowID + stage) % q, row_comm);
    if (stage == 0) {
        multiplyMatrix(a_temp, b, c, n_local);
        // shift b upward; MPI_Sendrecv_replace avoids posting a send and a
        // receive on the same buffer at once, which MPI forbids
        MPI_Sendrecv_replace(b, n_local*n_local, MPI_DOUBLE, nbrs[UP], 111,
                             nbrs[DOWN], 111, grid_comm, &status);
    } else {
        // shift the b values in all processes, overlapping the communication
        // with the local multiplication by double-buffering
        MPI_Isend(b_buffers[stage % 2], n_local*n_local, MPI_DOUBLE, nbrs[UP], 111, grid_comm, &my_request1);
        MPI_Irecv(b_buffers[(stage+1) % 2], n_local*n_local, MPI_DOUBLE, nbrs[DOWN], 111, grid_comm, &my_request2);
        multiplyMatrix(a_temp, b_buffers[stage % 2], c, n_local);
        MPI_Wait(&my_request2, &status);
        MPI_Wait(&my_request1, &status);
    }
}
Method 2:
// Fox's algorithm
for (stage = 0; stage < q; stage++) {
    // copy a into a_temp and broadcast a_temp of each process to all
    // other processes in its row
    for (i = 0; i < n_local*n_local; i++)
        a_temp[i] = a[i];
    MPI_Bcast(a_temp, n_local*n_local, MPI_DOUBLE, (rowID + stage) % q, row_comm);
    if (stage == 0) {
        multiplyMatrix(a_temp, b, c, n_local);
        // shift b upward; MPI_Sendrecv_replace avoids posting a send and a
        // receive on the same buffer at once, which MPI forbids
        MPI_Sendrecv_replace(b, n_local*n_local, MPI_DOUBLE, nbrs[UP], 111,
                             nbrs[DOWN], 111, grid_comm, &status);
    } else {
        // shift the b values in all processes: send the saved copy and
        // receive into b, so the send and receive buffers stay distinct
        memcpy(b_temp, b, n_local*n_local*sizeof(double));
        MPI_Isend(b_temp, n_local*n_local, MPI_DOUBLE, nbrs[UP], 111, grid_comm, &my_request1);
        MPI_Irecv(b, n_local*n_local, MPI_DOUBLE, nbrs[DOWN], 111, grid_comm, &my_request2);
        multiplyMatrix(a_temp, b_temp, c, n_local);
        MPI_Wait(&my_request2, &status);
        MPI_Wait(&my_request1, &status);
    }
}
Both of these seem to work, but as I said, I'm new to MPI, so if you have any comments or suggestions, please share.
If you don't use `status`, then you can use `MPI_STATUS_IGNORE` on that line. – 2017-08-06 10:22:52
Instead of the two `MPI_Wait()` calls, you can use an array of requests with `MPI_Waitall()` and `MPI_STATUSES_IGNORE` if you don't care about the statuses. – 2017-08-06 10:24:09