2016-11-16 64 views

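Using MPI_Allgather to send the local maxima

Hello. I want to find the local maximum on every process, then pass all of the local maxima to all processes so that each one holds them in a single array. After that I want to compare the local maxima using an MPI ring topology and output the global maximum.

I can do this more efficiently with MPI_Allreduce, and I have already done so, but I want to test the efficiency of a ring topology and have it produce the same result as the allreduce. I used MPI_Allgather as shown in a tutorial, and it returned some errors. The code is below: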

int main(int argc, char **argv) 
{  

    int rank, size; 

    MPI_Init (&argc, &argv);  // initializes MPI 
    //MPI_Comm comm; 
    double max_store[4]; 
    double *rbuf; 
    //MPI_Comm_size(comm, &rank); 
    MPI_Comm_rank (MPI_COMM_WORLD, &rank); // get current MPI-process ID: 0, 1, ... 
    MPI_Comm_size (MPI_COMM_WORLD, &size); // get the total number of processes 


    /* define how many integrals */ 
    const int n = 10;  

    double b[n] = {5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0,5.0};      
    double a[n] = {-5.0, -5.0, -5.0, -5.0, -5.0, -5.0, -5.0, -5.0, -5.0,-5.0}; 

    double result, mean; 
    int m; 

    const unsigned int N = 5; 
    double max = -1; 



    cout.precision(6); 
    cout.setf(ios::fixed | ios::showpoint); 


    srand(time(NULL) * rank); // each MPI process gets a unique seed 

    m = 4;    // initial number of intervals 

    // convert command-line input to N = number of points 
    //N = atoi(argv[1]); 


    for (unsigned int i=0; i <=N; i++) 
    { 
     result = int_mcnd(f, a, b, n, m); 
     mean = result/(pow(10,10)); 

     m = m*4; 
     if(mean > max) 
     { 
     max = mean; 
     } 


    } 
    //if (rank < 4 && rank >= 0) 
     //{ 
     //max_store[rank] = max; 
     //} 





    printf("Process ID %i, local_max = %f\n",rank, max); 


    // All processes get the global max, stored in place of the local max 
    MPI_Allreduce(MPI_IN_PLACE, &max, 1, MPI_DOUBLE, MPI_MAX, MPI_COMM_WORLD); 

    printf("Process ID %d, global_max = %f\n",rank, max); 

    rbuf = (double *)malloc(rank*4*sizeof(double)); 
    MPI_Allgather(max_store, 4, MPI_DOUBLE, rbuf, 4, MPI_DOUBLE, MPI_COMM_WORLD); 
    //print the array containing max from each processor 
    //int k;  
    //for(int k = 0; k < 4; k++) 
    //{ 
    //printf("%1.5e\n", max_store[k]); 
    //} 



    double send_junk = max_store[0]; 
    double rec_junk; 
    //double global_max; 
    MPI_Status status; 

    if(rank==0) 
    { 
    MPI_Send(&send_junk, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD); // send data to process 1 
    } 
    if(rank==1) 
    { 
    MPI_Recv(&rec_junk, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status); // receive data from process 0 
    } 
    //check between process 0 and process 1 maxima 
    if(rec_junk>=max_store[1]) 
    { 
    rec_junk = max_store[0]; 
    } 
    else 
    { 
    rec_junk = max_store[1]; 
    } 
    send_junk = rec_junk; 

    MPI_Send(&send_junk, 4, MPI_DOUBLE, 2, 0, MPI_COMM_WORLD); // send data to process 2 

    if(rank==2) 
    { 
    MPI_Recv(&rec_junk, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &status); // receive data from process 1 
    } 
    //check between process 1 and process 2 maxima 
    if(rec_junk>=max_store[2]) 
    { 
    rec_junk = rec_junk; 
    } 
    else 
    { 
    rec_junk = max_store[2]; 
    } 
    send_junk = rec_junk; 

    MPI_Send(&send_junk, 4, MPI_DOUBLE, 3, 0, MPI_COMM_WORLD); // send data to process 3 

    if(rank==3) 
    { 
    MPI_Recv(&rec_junk, 4, MPI_DOUBLE, 2, 0, MPI_COMM_WORLD, &status); // receive data from process 2 
    } 
    //check between process 2 and process 3 maxima 
    if(rec_junk>=max_store[3]) 
    { 
    rec_junk = rec_junk; 
    } 
    else 
    { 
    rec_junk = max_store[3]; 
    } 


    printf("global ring max = %f", rec_junk); 


    MPI_Finalize(); // programs should always perform a "graceful" shutdown 

    return 0; 
} 

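I would like to know how to send the maxima into a single array that every process has access to, so that I can compare the values in a ring topology. Many thanks.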

Answer


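You are not allocating the receive buffer correctly. It needs to be large enough to store 4 entries from each rank, that is, size*4 doubles in total. Because you size it with rank instead, rank 0 allocates zero bytes and every rank receives less space than MPI_Allgather writes into. You currently have: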

rbuf = (double *)malloc(rank*4*sizeof(double)); 

when it should be

rbuf = (double *)malloc(size*4*sizeof(double));
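
To make the sizing rule concrete, here is a small serial sketch in plain C++ that contains no MPI calls. It only models the receive buffer layout that MPI_Allgather produces, followed by the ring comparison the question describes. The names allgather_model and ring_max_model are hypothetical helpers invented for this illustration. In a real program the gather step would be a call such as MPI_Allgather(&max, 1, MPI_DOUBLE, rbuf, 1, MPI_DOUBLE, MPI_COMM_WORLD), and each hop of the ring would be an MPI_Send/MPI_Recv pair between neighbouring ranks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Serial model of the MPI_Allgather result (hypothetical helper, not an MPI API).
// send[r] holds the `count` doubles contributed by rank r. Every rank ends up
// with the same buffer of size * count doubles, ordered by rank.
std::vector<double> allgather_model(const std::vector<std::vector<double>> &send,
                                    std::size_t count) {
    const std::size_t size = send.size();   // number of ranks
    std::vector<double> rbuf(size * count); // size * count, NOT rank * count
    for (std::size_t r = 0; r < size; ++r) {
        assert(send[r].size() >= count);    // each rank must supply `count` values
        for (std::size_t k = 0; k < count; ++k) {
            rbuf[r * count + k] = send[r][k]; // rank r's block starts at r * count
        }
    }
    return rbuf;
}

// Serial model of the ring pass (hypothetical helper). Rank 0 starts with its
// own maximum; each later rank receives the running maximum from its left
// neighbour, compares it with its own value, and forwards the larger one.
// Precondition: local_max is not empty (at() throws otherwise).
double ring_max_model(const std::vector<double> &local_max) {
    double running = local_max.at(0);
    for (std::size_t r = 1; r < local_max.size(); ++r) {
        running = std::max(running, local_max[r]); // comparison done on rank r
    }
    return running;
}
```

With one value per rank (count = 1), the gathered buffer is simply the array of local maxima indexed by rank, so ring_max_model(allgather_model(send, 1)) should agree with what MPI_Allreduce with MPI_MAX returns. Note that once every rank holds the full array, each rank could also compute the maximum locally, so the ring pass is only needed if the point is to time that communication pattern.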