2017-08-30

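MPI vs sequential code - problem freeing arrays

I get a strange result when comparing the sequential and MPI versions of a small code that computes values on a grid.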

The sequential version looks like this:

int main() { 

    /* Array */ 
    double **x; 

    /* Allocation of 2D arrays */ 
    x = malloc(size_tot_y*sizeof(*x)); 

    for (i=0;i<=size_tot_y-1;i++) { 
     x[i] = malloc(size_tot_x*sizeof(**x)); 
    } 

    /* Do various computations */ 

    /* End of code */ 

    /* Free all arrays */ 
    for (i=0;i<=size_tot_y-1;i++) { 
     free(x[i]); 
    } 
    free(x); 

    return 0; 

} 

This sequential version works fine, and all arrays (x, x0) appear to be freed correctly.

Now, if I take the MPI version, it looks like this:

int main() { 

    /* Array */ 
    double **x; 
    double *xfinal; 

    /* Allocate size_tot_y rows */ 
    x = malloc(size_tot_y*sizeof(*x)); 

    /* Allocate 2D Contiguous arrays for x */ 
    x[0] = malloc(size_tot_x*size_tot_y*sizeof(**x)); 

    /* Loop on rows */ 
    for (j=1;j<size_tot_y;j++) { 
        /* Point row j at its offset inside the contiguous block */ 
        x[j] = x[0] + j*size_tot_x; 
    } 

     /* Do various computations */ 

     /* End of MPI code */ 

    /* Free all arrays */ 
    for (i=0;i<=size_tot_y-1;i++) { 
     free(x[i]); 
    } 
    free(x); 

    return 0; 

    } 

At execution, I get the following error:

[machine1:04130] *** Process received signal *** 
[machine1:04130] Signal: Segmentation fault (11) 
[machine1:04130] Signal code: Address not mapped (1) 
[machine1:04130] Failing at address: 0x7f179c020838 
[machine1:04131] *** Process received signal *** 
[machine1:04131] Signal: Segmentation fault (11) 
[machine1:04131] Signal code: Address not mapped (1) 
[machine1:04131] Failing at address: 0x7ff0b417c838 
[machine1:04132] *** Process received signal *** 
[machine1:04132] Signal: Segmentation fault (11) 
[machine1:04132] Signal code: Address not mapped (1) 
[machine1:04132] Failing at address: 0x7f8560001838 
[machine1:04133] *** Process received signal *** 
[machine1:04133] Signal: Segmentation fault (11) 
[machine1:04133] Signal code: Address not mapped (1) 
[machine1:04133] Failing at address: 0x7f22f415f838 
[machine1:04134] *** Process received signal *** 
[machine1:04140] *** Process received signal *** 

      [machine1:04134] Signal: Segmentation fault (11) 
      [machine1:04134] Signal code: Address not mapped (1) 
      [machine1:04134] Failing at address: 0x7f4e3c0d3838 
      [machine1:04142] *** Process received signal *** 
      [machine1:04142] Signal: Segmentation fault (11) 
      [machine1:04142] Signal code: Address not mapped (1) 
      [machine1:04142] Failing at address: 0x7ff0d4064838 
      [machine1:04140] Signal: Segmentation fault (11) 
      [machine1:04140] Signal code: Address not mapped (1) 
      [machine1:04140] Failing at address: 0x7fb2941c3838 
      [machine1:04129] *** Process received signal *** 
      [machine1:04129] Signal: Segmentation fault (11) 
      [machine1:04129] Signal code: Address not mapped (1) 
      [machine1:04129] Failing at address: 0x7f9150049838 
      [machine1:04142] [machine1:04134] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7f4e48e55890] 
      [machine1:04134] [machine1:04129] [ 0] [machine1:04130] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x[machine1:04131] [ 0] [machine1:04132] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0([machine1:04140] [ 1] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0/lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7f91550a8890] 
      [machine1:04129] [ 1] f890)[0x7f179f424890] 
      [machine1:04130] [ 1] /lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7ff0b777e890] 
      [machine1:04131] [ 1] [machine1:04133] [ 0] +0xf890)[0x7f8564847890] 
      [machine1:04132] [ 1] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x14)[0x7f4e48b17614] 
      [machine1:04134] (+0xf890)[0x7fb2979c7890] 
      /lib/x86_64-linux-gnu/libc.so.6(cfree+0x14)[0x7f179f0e6614] 
      [machine1:04130] [ 2] ./explicitPar[0x401c48] 
      /lib/x86_64-linux-gnu/libpthread.so.0[ 2] ./explicitPar[0x401c48] 
      [machine1:04134] [ 3] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x14)[0x7f8564509614] 
      [machine1:04132] (+0xf890/lib/x86_64-linux-gnu/libc.so.6(cfree+0x14)[0x7f9154d6a614] 
      [machine1:04129] [machine1:04140] [ 1] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x14)[0x7ff0b7440614] 
      [machine1:04131] [machine1:04130] [ 3] /lib/x86_64-linux-gnu/libc.so.6(/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x[ 2] ./explicitPar[0x401c48] 
      [machine1:04132] [ 3] [ 2] ./explicitPar[0x401c48] 
      [machine1:04129] [ 3] [ 2] ./explicitPar[0x401c48] 
      [machine1:04131] [ 3] __libc_start_main+0xf5)[0x7f179f08bb45] 
      [machine1:04130] [ 4] ./explicitPar[0x400e49] 
      [machine1:04130] *** End of error message *** 
      f5)[0x7f4e48abcb45] 
      [machine1:04134])[0x7f22f8bb2890] 
      [machine1:04133] /lib/x86_64-linux-gnu/libc.so.6/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7ff0b73e5b45[ 4] ./explicitPar[0x400e49] 
      /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[ 1] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f9154d0fb45] 
      [machine1:04129] ] 
      [machine1:04131] [ 4] ./explicitPar[0x7f85644aeb45] 
      [machine1:04132] /lib/x86_64-linux-gnu/libc.so.6(cfree[ 0] [ 4] ./explicitPar[0x400e49] 
      [machine1:04129] *** End of error message *** 
      (cfree+0x14)[0x7fb297689614] 
      [machine1:04140] [ 2] ./explicitPar[0x401c48[machine1:04134] *** End of error message *** 
      [0x400e49] 
      [machine1:04131] *** End of error message *** 
      [ 4] ./explicitPar[0x400e49] 
      [machine1:04132] *** End of error message *** 
      +0x14)[0x7f22f8874614] 
      [machine1:04133] ] 
      [machine1:04140] [ 3] [ 2] ./explicitPar/lib/x86_64-linux-gnu/libc.so.6[0x401c48] 
      [machine1:04133] [ 3] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7fb29762eb45] 
      [machine1:04140] [ 4] (__libc_start_main+0xf5)[0x./explicitPar[0x7f22f8819b45] 
      [machine1:04133] 400e49] 
      [machine1:04140] *** End of error message *** 
      [ 4] ./explicitPar[0x400e49] 
      [machine1:04133] *** End of error message *** 
      /lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7ff0d9907890] 
      [machine1:04142] [ 1] -------------------------------------------------------------------------- 
      mpirun noticed that process rank 1 with PID 0 on node machine1 exited on signal 11 (Segmentation fault). 

If I free only the array itself:

free(x); 

that is, with this part commented out:

/*for (i=0;i<=size_tot_y-1;i++) { 
     free(x[i]);  
    } 
*/ 

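then I no longer get the error above, so the problem comes from the way the arrays are freed in the MPI version. Why is the loop over free(x[i]) wrong here? I would have thought that the arrays are freed the same way in both cases, but apparently they are not.

Any help or comments are welcome. Regards.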

Answer


Array allocation and deallocation must be symmetric.

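You declared your 2D arrays as double **, so they really are arrays of pointers to arrays of double. In the sequential version, you issued one malloc() for the array of row pointers, and then one malloc() for each row. Your rows will not be in contiguous memory, but that is fine.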
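The symmetry for this row-by-row layout can be sketched as follows. The helper names alloc_rows and free_rows are hypothetical and do not appear in the question; the only point is that every malloc() has exactly one matching free().

```c
#include <stdlib.h>

/* Hypothetical helper: one malloc() for the row pointers, then one per row. */
double **alloc_rows(size_t ny, size_t nx)
{
    double **x = malloc(ny * sizeof(*x));
    if (x == NULL)
        return NULL;
    for (size_t i = 0; i < ny; i++) {
        x[i] = malloc(nx * sizeof(**x));
        if (x[i] == NULL) {
            /* Undo the rows allocated so far, then the pointer array. */
            while (i-- > 0)
                free(x[i]);
            free(x);
            return NULL;
        }
    }
    return x;
}

/* Mirror image: one free() per row, then one for the row pointers. */
void free_rows(double **x, size_t ny)
{
    if (x == NULL)
        return;
    for (size_t i = 0; i < ny; i++)
        free(x[i]);
    free(x);
}
```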

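The row-by-row layout is generally unsuitable for MPI, since you will likely pass your 2D array to some MPI function that requires a contiguous data layout. So in the MPI version you issued one malloc() for the array of row pointers (no change so far), and then a single malloc() for all the rows. You then filled the first allocated array with pointers into the second one. Consequently, when you free the 2D array, you must issue exactly two free() calls.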

So the right way to deallocate the x array is

free(x[0]); 
free(x); 
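
Putting the allocation and its matching deallocation side by side, a minimal sketch could look like this. The helper names alloc_contiguous and free_contiguous are hypothetical; the question's size_tot_y and size_tot_x are passed in as ny and nx.

```c
#include <stdlib.h>

/* Hypothetical helper: exactly two malloc() calls, whatever ny is. */
double **alloc_contiguous(size_t ny, size_t nx)
{
    if (ny == 0 || nx == 0)
        return NULL;
    double **x = malloc(ny * sizeof(*x));    /* row pointers */
    if (x == NULL)
        return NULL;
    x[0] = malloc(ny * nx * sizeof(**x));    /* all rows in one block */
    if (x[0] == NULL) {
        free(x);
        return NULL;
    }
    /* x[1] .. x[ny-1] point inside the block; they are not malloc() results. */
    for (size_t j = 1; j < ny; j++)
        x[j] = x[0] + j * nx;
    return x;
}

/* Exactly two free() calls: the data block first, then the row pointers. */
void free_contiguous(double **x)
{
    if (x == NULL)
        return;
    free(x[0]);
    free(x);
}
```

Since all the data live in one block, &x[0][0] can be handed to a routine that expects ny*nx contiguous doubles, while x[j][i] indexing still works.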

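Thanks for the explanation. If I have understood correctly, with "free(x[i]); free(x0[i]);" for i = 0, 1, 2, ... I am freeing memory that has already been released, aren't I? – youpilat13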


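`free(x[0])` is fine. But `free(x[1])`, for example, is incorrect, because `x[1]` is not an address returned by `malloc()`, so it must not be passed to `free()`. Strictly speaking, you are trying to free a pointer that was never the result of an allocation. –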


OK, understood, thanks a lot! – youpilat13