Wrong rank IDs when using MPI_Reduce

Well, I'm doing some homework using MPI + C. Actually, I just wrote a small program for Programming Assignment 3.2 from Peter Pacheco's book, "An Introduction to Parallel Programming". The code seems to work fine with 3 or 5 processes... but when I try more than 6 processes, the program breaks.

I'm using a very "bad" debugging method, namely tracing with some printfs to find where the problem appears. Using this "method", I discovered that right after MPI_Reduce some strange behavior shows up and my program gets confused about the rank IDs; specifically, rank 0 vanishes and one very large (and wrong) rank appears.

My code is below, and after it I post the outputs for 3 and 9 processes... I run with
mpiexec -n X ./name_of_program
where X is the number of processes.

My code:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(void)
{
    MPI_Init(NULL, NULL);

    long long int local_toss = 0, local_num_tosses = -1,
                  local_tosses_in_circle = 0, global_tosses_in_circle = 0;
    double local_x = 0.0, local_y = 0.0, pi_estimate = 0.0;
    int comm_sz, my_rank;

    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    if (my_rank == 0) {
        printf("\nEnter the number of dart tosses: ");
        fflush(stdout);
        scanf("%lld", &local_num_tosses);
        fflush(stdout);
    }

    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Bcast(&local_num_tosses, 1, MPI_LONG_LONG_INT, 0, MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);

    srand(rand()); // tried to improve randomness here!

    for (local_toss = 0; local_toss < local_num_tosses; local_toss++) {
        local_x = (-1) + (double)rand() / (RAND_MAX / 2);
        local_y = (-1) + (double)rand() / (RAND_MAX / 2);
        if ((local_x * local_x + local_y * local_y) <= 1) {
            local_tosses_in_circle++;
        }
    }

    MPI_Barrier(MPI_COMM_WORLD);

    MPI_Reduce(
        &local_tosses_in_circle,
        &global_tosses_in_circle,
        comm_sz,
        MPI_LONG_LONG_INT,
        MPI_SUM,
        0,
        MPI_COMM_WORLD
    );

    printf("\n\nDEBUG: myrank = %d, comm_size = %d", my_rank, comm_sz);
    fflush(stdout);

    MPI_Barrier(MPI_COMM_WORLD);

    if (my_rank == 0) {
        pi_estimate = ((double)(4 * global_tosses_in_circle)) / ((double)comm_sz * local_num_tosses);
        printf("\nPi estimate = %1.5lf \n", pi_estimate);
        fflush(stdout);
    }

    MPI_Finalize();
    return 0;
}
And the two outputs:

(i) For 3 processes:
Enter the number of dart tosses: 1000000
DEBUG: myrank = 0, comm_size = 3
DEBUG: myrank = 1, comm_size = 3
DEBUG: myrank = 2, comm_size = 3
Pi estimate = 3.14296
(ii) For 9 processes (note that the \n output is strange and sometimes it doesn't work):
Enter the number of dart tosses: 10000000
DEBUG: myrank = 1, comm_size = 9
DEBUG: myrank = 7, comm_size = 9
DEBUG: myrank = 3, comm_size = 9
DEBUG: myrank = 2, comm_size = 9DEBUG: myrank = 5, comm_size = 9
DEBUG: myrank = 8, comm_size = 9
DEBUG: myrank = 6, comm_size = 9
DEBUG: myrank = 4, comm_size = 9DEBUG: myrank = -3532887, comm_size = 141598939[PC:06511] *** Process received signal ***
[PC:06511] Signal: Segmentation fault (11)
[PC:06511] Signal code: (128)
[PC:06511] Failing at address: (nil)
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 6511 on node PC exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
Very nice, Rafael, thank you so much! It solved my problem! Indeed, the third argument of MPI_Reduce is the count of data elements... if it's not a vector, the correct value is 1... really! I feel so silly :-( lol. About the barriers, I think the same as you... but look at [this](http://stackoverflow.com/questions/9284419/is-mpi-reduce-blocking-or-a-natural-barrier) Stack Overflow post... it's really confusing! – guipy 2013-04-08 17:58:15
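For reference, a minimal sketch of the corrected call, assuming the rest of the program stays as posted. Each rank contributes a single long long, so the count argument must be 1, not comm_sz:

    /* Corrected reduction: each rank sends exactly one long long,
       so the count argument is 1. Passing comm_sz makes MPI read and
       write comm_sz elements through pointers to single variables,
       corrupting neighbouring stack memory (which is why my_rank and
       comm_sz turn into garbage after the call). */
    MPI_Reduce(&local_tosses_in_circle, &global_tosses_in_circle, 1,
               MPI_LONG_LONG_INT, MPI_SUM, 0, MPI_COMM_WORLD);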
I saw the post you linked; I may go in and answer your question there. If you are not executing MPI_Reduce in a loop, as mentioned there, dropping the MPI_Barrier should be fine. – 2013-04-08 21:39:42
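To illustrate that point, a sketch (under the assumption the program otherwise stays as posted): MPI_Bcast and MPI_Reduce are blocking calls, so this program's correctness does not depend on any of the surrounding MPI_Barrier calls. Note that MPI_Reduce is not guaranteed to act as a full barrier (that subtlety is what the linked post discusses), but nothing here needs one:

    /* Sketch: the barriers around the collectives can simply be removed. */
    MPI_Bcast(&local_num_tosses, 1, MPI_LONG_LONG_INT, 0, MPI_COMM_WORLD);
    /* ... toss loop unchanged ... */
    MPI_Reduce(&local_tosses_in_circle, &global_tosses_in_circle, 1,
               MPI_LONG_LONG_INT, MPI_SUM, 0, MPI_COMM_WORLD);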
Thank you so much, Rafael, for all the effort to help me! Thanks a lot! – guipy 2013-04-08 22:49:06