Barrier call stuck in Open MPI (C program)

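I am implementing barrier synchronization using Open MPI message passing. I have created an array of structs called containers. Each container is linked to its neighbor on the right, and the two elements at the ends are also linked to each other, forming a circle. In the main() test client, I run MPI with multiple processes (mpiexec -n 5 ./a.out), and they are supposed to synchronize by calling the barrier() function. However, my code gets stuck at the last process. I am looking for help with debugging. Please see my code below: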

#include <stdlib.h> 
#include <stdio.h> 
#include <string.h> 
#include <mpi.h> 

typedef struct container { 
    int labels;     
    struct container *linked_to_container;  
    int sense; 
} container; 

container *allcontainers; /* an array for all containers */ 
int size_containers_array; 

int get_next_container_id(int current_container_index, int max_index) 
{ 
    if (max_index - current_container_index >= 1)
    {
        return current_container_index + 1;
    }
    else
        return 0;  /* elements at two ends are linked */
} 

container *get_container(int index) 
{ 
    return &allcontainers[index]; 
} 


void container_init(int num_containers) 
{ 
    allcontainers = (container *) malloc(num_containers * sizeof(container)); /* is this right to malloc memory on the array of container when the struct size is still unknown?*/ 
    size_containers_array = num_containers; 

    int i; 
    for (i = 0; i < num_containers; i++)
    {
        container *current_container = get_container(i);
        current_container->labels = 0;
        int next_container_id = get_next_container_id(i, num_containers - 1);  /* max index in all_containers[] is num_containers-1 */
        current_container->linked_to_container = get_container(next_container_id);
        current_container->sense = 0;
    }
} 

void container_barrier() 
{ 
    int current_container_id, my_sense = 1; 
    int tag = current_container_id; 
    MPI_Request request[size_containers_array]; 
    MPI_Status status[size_containers_array]; 

    MPI_Comm_rank(MPI_COMM_WORLD, &current_container_id); 
    container *current_container = get_container(current_container_id); 

    int next_container_id = get_next_container_id(current_container_id, size_containers_array - 1); 

    /* send asynchronous message to the next container, wait, then do blocking receive */ 
    MPI_Isend(&my_sense, 1, MPI_INT, next_container_id, tag, MPI_COMM_WORLD, &request[current_container_id]); 
    MPI_Wait(&request[current_container_id], &status[current_container_id]); 
    MPI_Recv(&my_sense, 1, MPI_INT, next_container_id, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE); 

} 

void free_containers() 
{ 
    free(allcontainers); 
} 

int main(int argc, char **argv) 
{ 
    int my_id, num_processes; 
    MPI_Init(&argc, &argv); 
    MPI_Comm_size(MPI_COMM_WORLD, &num_processes); 
    MPI_Comm_rank(MPI_COMM_WORLD, &my_id); 

    container_init(num_processes); 

    printf("Hello world from thread %d of %d \n", my_id, num_processes); 
    container_barrier(); 
    printf("passed barrier \n"); 



    MPI_Finalize(); 
    free_containers(); 

    return 0; 
}

Source

2014-03-03 TonyGW

The problem is this sequence of calls:

MPI_Isend() 
MPI_Wait() 
MPI_Recv()

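This is a common source of confusion. When you use a "nonblocking" call in MPI, you are essentially telling the MPI library that you want to perform some operation (a send) on some data (my_sense). MPI hands you back an MPI_Request object, with the guarantee that the operation will be finished by the time a completion function (such as MPI_Wait) has completed that MPI_Request.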

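The problem you are running into is that you call MPI_Isend and immediately call MPI_Wait, before MPI_Recv has been called on any rank. This means that all of those sends get queued up but have nowhere to go, because you have not yet told MPI where to put the data by calling MPI_Recv (which tells MPI that you want the data placed in my_sense).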

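The reason this works part of the time is that MPI expects that things will not always sync up perfectly. If your messages are small (which they are), MPI reserves some buffer space, lets your send operations complete, and stashes the data in that temporary space until you call MPI_Recv later to tell MPI where to move it. Eventually, though, this stops working: the buffers fill up and you need to actually start receiving your messages. For you, this means you need to switch the order of your operations. Instead of doing a nonblocking send, you should do a nonblocking receive first, then your blocking send, and then wait for the receive to finish: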

MPI_Irecv() 
MPI_Send() 
MPI_Wait()

The other option is to turn both calls into nonblocking calls and use MPI_Waitall instead:

MPI_Isend() 
MPI_Irecv() 
MPI_Waitall()

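This last option is usually the best. The only thing you need to be careful about is not overwriting your own data. Right now you are using the same buffer for both the send and the receive. If both operations are in progress at the same time, there is no guarantee about their ordering. Normally that makes no difference: whether the message is sent first or received first does not really matter. In this case, however, it does. If the data is received first, you will end up sending that same data back out again instead of the data you had before the receive. You can solve this by using a temporary buffer to stage the data and moving it to the right place once it is safe to do so.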

Source

2014-03-03 20:14:53
