
MPI debugging segfault

I am trying to sort an array of random numbers using an odd-even transposition sort, but when I run my code I keep getting a segmentation fault:

[islb:48966] *** Process received signal *** 
[islb:48966] Signal: Segmentation fault (11) 
[islb:48966] Signal code: Address not mapped (1) 
[islb:48966] Failing at address: 0x28 
[islb:48966] [ 0] /lib64/libpthread.so.0(+0xf810)[0x7fc3da4cb810] 
[islb:48966] [ 1] /lib64/libc.so.6(memcpy+0xa3)[0x7fc3da1c7cf3] 
[islb:48966] [ 2] /usr/local/lib/libopen-pal.so.6(opal_convertor_unpack+0x10b)[0x7fc3d9c372db] 
[islb:48966] [ 3] /usr/local/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv_request_progress_match+0x138)[0x7fc3d58507a8] 
[islb:48966] [ 4] /usr/local/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv_req_start+0x1b1)[0x7fc3d5850d11] 
[islb:48966] [ 5] /usr/local/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv+0x139)[0x7fc3d5849489] 
[islb:48966] [ 6] /usr/local/lib/libmpi.so.1(MPI_Recv+0xc0)[0x7fc3da742f40] 
[islb:48966] [ 7] oddEven[0x40115a] 
[islb:48966] [ 8] /lib64/libc.so.6(__libc_start_main+0xe6)[0x7fc3da161c36] 
[islb:48966] [ 9] oddEven[0x400c19] 
[islb:48966] *** End of error message *** 
-------------------------------------------------------------------------- 
mpirun noticed that process rank 1 with PID 48966 on node islb exited on signal 11 (Segmentation fault). 
-------------------------------------------------------------------------- 

The program allocates the array; the error seems to occur when it is scattered among the processes, because the print statement directly after the scatter call only prints for process 0 before the error message appears.

Here is my code:

#include <stdio.h>
#include <math.h>
#include <malloc.h>
#include <time.h>
#include <string.h>
#include "mpi.h"

const int MAX = 10000;
int myid, numprocs;
int i, n, j, k, arrayChunk, minindex;
int A, B;
int temp;

int swap(int *x, int *y) {
    temp = *x;
    *x = *y;
    *y = temp;
    return 0;
}

int main(int argc, char **argv) {
    int* arr = NULL;
    int* value = NULL;
    MPI_Status status;
    //int arr[] = {16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1};

    srand(time(0));
    time_t t1, t2;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

    if (myid == 0) {
        printf("Enter the number of elements you would like in the array \n");
        scanf("%d", &n);

        arrayChunk = n/numprocs;
        //printf("cpus: %d, #s per cpu: %d\n", numprocs, arrayChunk);

        //Allocate memory for the array
        arr = malloc(n * sizeof(int));
        value = malloc(n * sizeof(int));

        // Generate an array of size n random numbers and prints them
        printf("Elements in the array: ");
        for (i = 0; i < n; i++) {
            arr[i] = (rand() % 100) + 1;
            printf("%d ", arr[i]);
        }
        printf("\n");
        time(&t1);
    }

    if ((n % numprocs) != 0) {
        if (myid == 0)
            printf("Number of Elements are not divisible by numprocs \n");
        MPI_Finalize();
        return(0);
    }

    // Broadcast the size of each chunk
    MPI_Bcast(&arrayChunk, 1, MPI_INT, 0, MPI_COMM_WORLD);
    MPI_Scatter(&arr, arrayChunk, MPI_INT, &value, arrayChunk, MPI_INT, 0, MPI_COMM_WORLD);
    printf("Processor %d receives %d\n", myid, value[0]);

    for (i = 0; i < numprocs; i++) {
        if (i % 2 == 0) {
            if (myid%2 == 0) {
                MPI_Send(&value[0], arrayChunk, MPI_INT, myid + 1, 0, MPI_COMM_WORLD);
                MPI_Recv(&value[arrayChunk], arrayChunk, MPI_INT, myid + 1, 0, MPI_COMM_WORLD, &status);

                for (j = 0; j < (arrayChunk * 2 - 1); j++) {
                    minindex = j;
                    for (k = j + 1; k < arrayChunk * 2; k++) {
                        if (value[k] < value[minindex]) {
                            minindex = k;
                        }
                    }
                    if (minindex > j) {
                        swap(&value[j], &value[minindex]);
                    }
                }
                //printf("myid %d i: %d, %d\n", myid, i, value[0]);
            } else {
                MPI_Recv(&value[arrayChunk], arrayChunk, MPI_INT, myid - 1, 0, MPI_COMM_WORLD, &status);
                MPI_Send(&value[0], arrayChunk, MPI_INT, myid - 1, 0, MPI_COMM_WORLD);

                for (j = 0; j < (arrayChunk * 2 - 1); j++) {
                    minindex = j;
                    for (k = j + 1; k < arrayChunk * 2; k++) {
                        if (value[k] < value[minindex]) {
                            minindex = k;
                        }
                    }
                    if (minindex > j) {
                        swap(&value[j], &value[minindex]);
                    }
                }

                for (j = 0; j < arrayChunk; j++) {
                    swap(&value[j], &value[j + arrayChunk]);
                }
                //printf("myid %d i: %d, %d\n", myid, i, value[0]);
            }
        } else {
            if ((myid%2 == 1) && (myid != (numprocs-1))) {
                MPI_Send(&value[0], arrayChunk, MPI_INT, myid + 1, 0, MPI_COMM_WORLD);
                MPI_Recv(&value[arrayChunk], arrayChunk, MPI_INT, myid + 1, 0, MPI_COMM_WORLD, &status);

                for (j = 0; j < (arrayChunk * 2 - 1); j++) {
                    minindex = j;
                    for (k = j + 1; k < arrayChunk * 2; k++) {
                        if (value[k] < value[minindex]) {
                            minindex = k;
                        }
                    }
                    if (minindex > j) {
                        swap(&value[j], &value[minindex]);
                    }
                }
                //printf("myid %d i: %d, %d\n", myid, i, value[0]);
            } else if (myid != 0 && myid != (numprocs-1)) {
                MPI_Recv(&value[arrayChunk], arrayChunk, MPI_INT, myid - 1, 0, MPI_COMM_WORLD, &status);
                MPI_Send(&value[0], 1, MPI_INT, myid - 1, 0, MPI_COMM_WORLD);

                for (j = 0; j < (arrayChunk * 2 - 1); j++) {
                    minindex = j;
                    for (k = j + 1; k < arrayChunk * 2; k++) {
                        if (value[k] < value[minindex]) {
                            minindex = k;
                        }
                    }
                    if (minindex > j) {
                        swap(&value[j], &value[minindex]);
                    }
                }

                for (j = 0; j < arrayChunk; j++) {
                    swap(&value[j], &value[j + arrayChunk]);
                }
                //printf("myid %d i: %d, %d\n", myid, i, value[0]);
            }
        }
    }

    MPI_Gather(&value[0], arrayChunk, MPI_INT, &arr[0], arrayChunk, MPI_INT, 0, MPI_COMM_WORLD);

    if (myid == 0) {
        time(&t2);
        printf("Sorted array: ");
        for (i = 0; i < n; i++) {
            printf("%d ", arr[i]);
        }
        printf("\n");
        printf("Time in sec. %f\n", difftime(t2, t1));
    }

    // Free allocated memory
    if (arr != NULL) {
        free(arr);
        arr = NULL;

        free(value);
        value = NULL;
    }
    MPI_Finalize();
    return 0;
}

I'm not very familiar with C, and it's quite possible I've used malloc and/or addresses and pointers incorrectly, so it may be something simple.

Sorry for the amount of code; I thought it would be best to provide all of it so it can be debugged properly.


You are more likely to get help if you provide a _minimal_ example that reproduces the problem. –


This line inside the swap() function: 'temp = *x;' is using the global variable 'temp'. Using a local/automatic variable would be better: 'int temp = *x;'. Note: three exclusive-or operations would be 1) faster and 2) need no stack space for 'temp' – user3629249
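
For illustration, a sketch of what the comment suggests (not part of the original post):

int swap(int *x, int *y) {
    int temp = *x;   /* local temporary instead of the global 'temp' */
    *x = *y;
    *y = temp;
    return 0;
}

/* XOR variant mentioned in the comment; note it zeroes the value when
   x and y point to the same element, so guard against aliasing */
int swap_xor(int *x, int *y) {
    *x ^= *y;
    *y ^= *x;
    *x ^= *y;
    return 0;
}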


This line: 'arrayChunk = n/numprocs;' is performing integer division. If numprocs is larger than the 'n' the user entered, the result will always be 0 – user3629249
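
A small guard illustrating that point (hypothetical, not in the original code):

arrayChunk = n / numprocs;        /* integer division: e.g. 7 / 8 == 0 */
if (arrayChunk == 0) {
    if (myid == 0)
        printf("n (%d) must be at least numprocs (%d)\n", n, numprocs);
    MPI_Finalize();
    return 0;
}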

Answers


The problem is in your MPI_Scatter call. You try to scatter the data and store it in value, but if you look above that code, only rank 0 has allocated any memory for value. When any of the other ranks try to store data into value, you get a segmentation fault (and indeed you do). Remove the value = malloc(...); line from inside the if block and put it after the MPI_Bcast, as value = malloc(arrayChunk * sizeof(int));. I haven't looked through the rest of the code to see whether there are other problems elsewhere, but this is likely the cause of the initial segmentation fault.
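
A minimal sketch of that rearrangement (note: passing the buffers arr and value to MPI_Scatter, rather than &arr and &value, is an extra change in this sketch and is not from the answer above):

// Broadcast the chunk size so every rank can size its own buffer
MPI_Bcast(&arrayChunk, 1, MPI_INT, 0, MPI_COMM_WORLD);

// Every rank allocates value (the malloc is removed from the rank-0 if block);
// the later neighbour exchange receives into value[arrayChunk], so sizing the
// buffer as 2 * arrayChunk elements may be needed for that step
value = malloc(arrayChunk * sizeof(int));

// Scatter arrayChunk ints of arr into each rank's value buffer
MPI_Scatter(arr, arrayChunk, MPI_INT, value, arrayChunk, MPI_INT, 0, MPI_COMM_WORLD);
printf("Processor %d receives %d\n", myid, value[0]);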


I would compile the program with debugging information (most likely the -g compile flag), try to get a core dump, and use the gdb debugger to find the bug. A core file is created when a process crashes and holds an image of the process's memory at the moment of the crash.

If no core dump file is created after the program crashes, you will need to figure out how to enable it on your system. You can write a simple buggy program (e.g. using a=x/0; or a similar error) and experiment with it. The core dump may be named core, PID.core (PID being the ID of the crashed process), or something similar. Sometimes it is enough to set the core file size to unlimited with ulimit. Also check the kernel.core_* sysctls on Linux.
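
For example, something along these lines (the commands are illustrative and assume an Open MPI toolchain on Linux):

mpicc -g -o oddEven oddEven.c    # build with debugging information
ulimit -c unlimited              # allow core files in this shell
mpirun -np 4 ./oddEven           # reproduce the crash; a core file should appear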

Once you have the core dump, you can use it with gdb or a similar debugger (ddd):

gdb executable_file core