2016-10-10 113 views
9

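How to send/receive in MPI using all processors

This program is written in C using MPI. I am new to MPI and want to use all processors, including process 0, for some calculations. To learn the concept, I wrote the simple program below. However, after receiving the input from process 0, the program hangs at the bottom and never sends the results back to process 0.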

#include <mpi.h> 
#include <stdio.h> 

int main(int argc, char** argv) {  
    MPI_Init(&argc, &argv); 
    int world_rank; 
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank); 
    int world_size; 
    MPI_Comm_size(MPI_COMM_WORLD, &world_size); 

    int number; 
    int result; 
    if (world_rank == 0) 
    { 
     number = -2; 
     int i; 
     for(i = 0; i < 4; i++) 
     { 
      MPI_Send(&number, 1, MPI_INT, i, 0, MPI_COMM_WORLD); 
     } 
     for(i = 0; i < 4; i++) 
     {   /* Error: can't get the results sent by the other processes below */ 
      MPI_Recv(&number, 1, MPI_INT, i, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE); 
      printf("Process 0 received number %d from i:%d\n", number, i); 
     } 
    } 
    /*I want to do this without using an else statement here, so that I can use process 0 to do some calculations as well*/ 

    MPI_Recv(&number, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE); 
    printf("*Process %d received number %d from process 0\n",world_rank, number); 
    result = world_rank + 1; 
    MPI_Send(&result, 1, MPI_INT, 0, 99, MPI_COMM_WORLD); /* problem happens here when trying to send result back to process 0*/ 

    MPI_Finalize(); 
} 

Compiling and running it gives this result:

$ mpicc test.c -o test
$ mpirun -np 4 test

*Process 1 received number -2 from process 0
*Process 2 received number -2 from process 0
*Process 3 received number -2 from process 0
/* hangs here and will not continue */

If you can, please give me an example or, if possible, edit the code above.

Answers

1

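I don't really see what would be wrong with using two if statements surrounding the working part. But anyway, here is an example of what could be done.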

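I modified your code to use collective communications, since they make much more sense than the series of sends/receives you used. Because the initial communication carries the same value to everyone, I use a single MPI_Bcast(), which does the same job in one call.
Conversely, since the result values are all different, a call to MPI_Gather() is the appropriate choice.
I also added a call to sleep(), only to simulate the processes working for a while before sending back their results.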

Now the code looks like this:

#include <mpi.h>
#include <stdlib.h> // for malloc and free
#include <stdio.h>  // for printf
#include <unistd.h> // for sleep

int main(int argc, char *argv[]) {

    MPI_Init(&argc, &argv);
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // sending the same number to all processes via broadcast from process 0
    int number = world_rank == 0 ? -2 : 0;
    MPI_Bcast(&number, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("Process %d received %d from process 0\n", world_rank, number);

    // Do something useful here
    sleep(1);
    int my_result = world_rank + 1;

    // Now collecting individual results on process 0
    int *results = world_rank == 0 ? malloc(world_size * sizeof(int)) : NULL;
    MPI_Gather(&my_result, 1, MPI_INT, results, 1, MPI_INT, 0, MPI_COMM_WORLD);

    // Process 0 prints what it collected
    if (world_rank == 0) {
        for (int i = 0; i < world_size; i++) {
            printf("Process 0 received result %d from process %d\n", results[i], i);
        }
        free(results);
    }

    MPI_Finalize();

    return 0;
}

After compiling it as follows:

$ mpicc -std=c99 simple_mpi.c -o simple_mpi

it runs and gives this:

$ mpiexec -n 4 ./simple_mpi 
Process 0 received -2 from process 0 
Process 1 received -2 from process 0 
Process 3 received -2 from process 0 
Process 2 received -2 from process 0 
Process 0 received result 1 from process 0 
Process 0 received result 2 from process 1 
Process 0 received result 3 from process 2 
Process 0 received result 4 from process 3 
1

Actually, processes 1-3 do send their results back to process 0. However, process 0 is stuck in the first iteration of this loop:

for(i=0; i<4; i++) 
{  
    MPI_Recv(&number, 1, MPI_INT, i, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE); 
    printf("Process 0 received number %d from i:%d\n", number, i); 
} 

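In the first MPI_Recv call, process 0 blocks waiting for a message from itself with tag 99, a message that process 0 has not sent yet.

In general, it is a bad idea for a process to send messages to or receive messages from itself, especially with blocking calls. Process 0 already has the value in memory. It does not need to send it to itself.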

One workaround is to start the receive loop from i=1:

for(i=1; i<4; i++) 
{   
    MPI_Recv(&number, 1, MPI_INT, i, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE); 
    printf("Process 0 received number %d from i:%d\n", number, i); 
} 

Running the code now will give you:

Process 1 received number -2 from process 0 
Process 2 received number -2 from process 0 
Process 3 received number -2 from process 0 
Process 0 received number 2 from i:1 
Process 0 received number 3 from i:2 
Process 0 received number 4 from i:3 
Process 0 received number -2 from process 0 

Note that using MPI_Bcast and MPI_Gather, as mentioned by Gilles, is a more efficient and more standard way to distribute and collect data.