2014-01-24 53 views

MPI_Waitall error: Address not mapped

I have the following code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <mpi.h>

static int rank, size;

char msg[] = "This is a test message";

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size != 2) {
        fprintf(stderr, "This test requires exactly 2 tasks (has: %d).\n", size);
        MPI_Finalize();
        return -1;
    }

    int run = 1;
    if (argc > 1) {
        run = atoi(argv[1]);
    }

    int len = strlen(msg) + 1;
    if (argc > 2) {
        len = atoi(argv[2]);
    }

    char buf[len];

    strncpy(buf, msg, len);

    MPI_Status statusArray[run];

    MPI_Request reqArray[run];

    double start = MPI_Wtime();

    for (int i = 0; i < run; i++) {
        if (!rank) {
            MPI_Isend(buf, len, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &reqArray[i]);
            printf("mpi_isend for run %d\n", i);
        } else {
            MPI_Irecv(buf, len, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &reqArray[i]);
            printf("mpi_irecv for run %d\n", i);
        }
    }
    int buflen = 512;
    char name[buflen];
    gethostname(name, buflen);
    printf("host: %s has rank %d\n", name, rank);
    printf("Reached here! for host %s before MPI_Waitall \n", name);
    if (!rank) {
        printf("calling mpi_waitall for sending side which is %s\n", name);
        MPI_Waitall(run, &reqArray[0], &statusArray[0]);
    } else {
        printf("calling mpi_waitall for receiving side which is %s\n", name);
        MPI_Waitall(run, &reqArray[0], &statusArray[0]);
    }
    printf("finished waiting! for host %s\n", name);
    double end = MPI_Wtime();
    if (!rank) {
        printf("Throughput: %.4f Gbps\n", 1e-9 * len * 8 * run/(end - start));
    }

    MPI_Finalize();
}

A segmentation fault occurs on the sending side before MPI_Waitall. The error message is:

[host1:27679] *** Process received signal *** 
[host1:27679] Signal: Segmentation fault (11) 
[host1:27679] Signal code: Address not mapped (1) 
[host1:27679] Failing at address: 0x8 
[host1:27679] [ 0] /lib64/libpthread.so.0() [0x3ce7e0f500] 
[host1:27679] [ 1] /usr/lib64/openmpi/mca_btl_openib.so(+0x21dc7) [0x7f46695c1dc7] 
[host1:27679] [ 2] /usr/lib64/openmpi/mca_btl_openib.so(+0x1cbe1) [0x7f46695bcbe1] 
[host1:27679] [ 3] /lib64/libpthread.so.0() [0x3ce7e07851] 
[host1:27679] [ 4] /lib64/libc.so.6(clone+0x6d) [0x3ce76e811d] 
[host1:27679] *** End of error message *** 

I think there is something wrong with the MPI_Request array. Can anyone point it out? Thanks!


What is the value of 'run' in the failing case?
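The value of 'run' matters because the program uses it to size two variable-length arrays (statusArray and reqArray), and atoi returns 0 for non-numeric input without reporting an error. The following is a minimal sketch of stricter argument parsing. The helper name parse_positive_int is hypothetical and not part of the original program, and nothing here is claimed to be the cause of the crash.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not in the original program): parse a strictly
 * positive int from a command-line string. Returns `fallback` when the
 * text is NULL, is not a number, has trailing characters, is out of
 * range for int, or is less than or equal to zero. */
int parse_positive_int(const char *text, int fallback) {
    if (text == NULL) {
        return fallback;
    }

    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);

    /* Reject overflow, empty input, and trailing garbage such as "12x". */
    if (errno != 0 || end == text || *end != '\0') {
        return fallback;
    }
    /* Array sizes must be positive and must fit in an int. */
    if (value <= 0 || value > INT_MAX) {
        return fallback;
    }
    return (int)value;
}
```

In the program above, `run = atoi(argv[1]);` could then become `run = parse_positive_int(argv[1], 1);`, and likewise for `len`.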

Answers


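I ran your program without any problems (apart from a warning about not including unistd.h). The problem is probably related to your Open MPI setup. Are you using machines with an InfiniBand network? If not, you may want to switch to using only the default TCP implementation, since your problem may be related to that.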

If you want to specify that only TCP should be used, run it like this:

mpirun --mca btl tcp,self -n 2 <prog_name> <prog_args> 

This ensures that openib is not accidentally detected and used when it should not be.

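On the other hand, if you do intend to use InfiniBand, you may have found some kind of problem in Open MPI. I doubt that is the case, though, since you are not doing anything unusual.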
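As a side note on the unistd.h warning mentioned above: on POSIX systems gethostname is declared in unistd.h, so including that header should silence the warning. The sketch below assumes a POSIX system. The wrapper name get_host_name is hypothetical, and the feature-test macro is only needed when compiling in a strict standard mode.

```c
/* Ask for POSIX.1-2001 declarations; must come before any #include. */
#define _POSIX_C_SOURCE 200112L

#include <stddef.h>
#include <unistd.h>  /* declares gethostname() on POSIX systems */

/* Hypothetical wrapper (not in the original program): fill `name` with
 * the host name and guarantee NUL termination.
 * Returns 0 on success and -1 on failure. */
int get_host_name(char *name, size_t size) {
    if (name == NULL || size == 0) {
        return -1;
    }
    if (gethostname(name, size) != 0) {
        name[0] = '\0';
        return -1;
    }
    /* POSIX leaves termination unspecified if the name was truncated. */
    name[size - 1] = '\0';
    return 0;
}
```

In the question's program, `gethostname(name, buflen);` could be replaced by `get_host_name(name, sizeof name);`.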


Yes, I am running on machines with an IB network, and I intend to use that interface. – Ra1nWarden


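In that case, it is probably related to Open MPI. I have retagged the question, adding the Open MPI tag, in the hope that one of those folks will show up soon and help.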


If you do not get a response here, you can also post your question to the Open MPI users mailing list (http://www.open-mpi.org/community/lists/ompi.php).