2017-02-16 32 views
1

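I have an application that uses OpenMPI, and I run it on both Windows and Linux. The Windows version works fine, but running on Linux results in a memory allocation error. The problem appears with certain application arguments that require more computation. To rule out memory leaks, I checked the Linux build with Valgrind and got some output. I then searched for information about that output and found a few posts on Stack Overflow and GitHub (I don't have enough reputation to attach links). After that, I updated OpenMPI to 2.0.2 and checked the application again, which gave new output. Is there a memory leak in OpenMPI, or am I doing something wrong? Can someone explain this Valgrind output with Open MPI?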

Part of the output:

==16210== 4 bytes in 1 blocks are definitely lost in loss record 5 of 327 
==16210== at 0x4C2DBB6: malloc (vg_replace_malloc.c:299) 
==16210== by 0x5657A59: strdup (strdup.c:42) 
==16210== by 0x51128E6: opal_basename (in /home/vshmelev/OMPI_2.0.2/lib/libopen-pal.so.20.2.0) 
==16210== by 0x7DDECA9: ??? 
==16210== by 0x7DDEDD4: ??? 
==16210== by 0x6FBFF84: ??? 
==16210== by 0x4E4EA9E: orte_init (in /home/vshmelev/OMPI_2.0.2/lib/libopen-rte.so.20.1.0) 
==16210== by 0x4041FD: orterun (orterun.c:818) 
==16210== by 0x4034E5: main (main.c:13) 

OpenMPI version: Open MPI 2.0.2
Valgrind version: valgrind-3.12.0
Virtual machine: Ubuntu 16.04 LTS 64-bit

With MPICH, the Valgrind output is:

==87863== HEAP SUMMARY: 
==87863==  in use at exit: 131,120 bytes in 2 blocks 
==87863== total heap usage: 2,577 allocs, 2,575 frees, 279,908 bytes allocated 
==87863== 
==87863== 131,120 bytes in 2 blocks are still reachable in loss record 1 of 1 
==87863== at 0x4C2DBB6: malloc (vg_replace_malloc.c:299) 
==87863== by 0x425803: alloc_fwd_hash (sock.c:332) 
==87863== by 0x425803: HYDU_sock_forward_stdio (sock.c:376) 
==87863== by 0x432A99: HYDT_bscu_stdio_cb (bscu_cb.c:19) 
==87863== by 0x42D9BF: HYDT_dmxu_poll_wait_for_event (demux_poll.c:75) 
==87863== by 0x42889F: HYDT_bscu_wait_for_completion (bscu_wait.c:60) 
==87863== by 0x42863C: HYDT_bsci_wait_for_completion (bsci_wait.c:21) 
==87863== by 0x40B123: HYD_pmci_wait_for_completion (pmiserv_pmci.c:217) 
==87863== by 0x4035C5: main (mpiexec.c:343) 
==87863== 
==87863== LEAK SUMMARY: 
==87863== definitely lost: 0 bytes in 0 blocks 
==87863== indirectly lost: 0 bytes in 0 blocks 
==87863==  possibly lost: 0 bytes in 0 blocks 
==87863== still reachable: 131,120 bytes in 2 blocks 
==87863==   suppressed: 0 bytes in 0 blocks 
==87863== 
==87863== For counts of detected and suppressed errors, rerun with: -v 
==87863== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) 
+0

Links related to the problem: [link 1](https://github.com/open-mpi/ompi/issues/2166) and [link 2](http://stackoverflow.com/questions/11218056/can-someone-explain-this-valgrind-error-with-open-mpi)

+0

Valgrind [output](https://drive.google.com/file/d/0B871wCRylUoWQlMtTVI5WV9OWTg/view?usp=sharing) with MPICH version 3.2.

+0

[Link](https://drive.google.com/file/d/0B871wCRylUoWT2RvV0ZuZ3ZzVFk/view?usp=sharing) to source

Answers

0

These outputs point to memory leaks in the MPI library, not in your application code. You can safely ignore them.

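More specifically, these leaks come from the launchers. ORTE is OpenMPI's runtime environment, responsible for launching and managing MPI processes. Hydra is MPICH's launcher and process manager.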
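One way to check where a reported leak originates is to look at the function names in each loss record's stack. The sketch below is only an illustration: it parses Memcheck text like the excerpts in the question and flags records whose stack passes through a launcher function. The prefixes `orterun` and `HYD` are a heuristic taken from the traces shown here, not an official list, and the helper names `parse_loss_records` and `from_launcher` are made up for this example.

```python
import re

# Heuristic (an assumption for this sketch): function-name prefixes of the
# launchers that appear in the traces quoted in the question.
LAUNCHER_PREFIXES = ("orterun", "HYD")

# "==PID== N bytes in M blocks are <kind> in loss record X of Y"
HEADER = re.compile(
    r"^==\d+==\s+([\d,]+) bytes in ([\d,]+) blocks are (.+?) "
    r"in loss record (\d+) of (\d+)"
)
# "==PID== at|by 0xADDR: function (...)"
FRAME = re.compile(r"^==\d+==\s+(?:at|by) 0x[0-9A-Fa-f]+: (\S+)")


def parse_loss_records(text):
    """Return one dict per Memcheck loss record found in `text`."""
    records = []
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = {
                "bytes": int(header.group(1).replace(",", "")),
                "kind": header.group(3),
                "frames": [],
            }
            records.append(current)
            continue
        frame = FRAME.match(line)
        if frame and current is not None:
            current["frames"].append(frame.group(1))
        elif not frame:
            current = None  # any other line ends the current record
    return records


def from_launcher(record):
    """True if any stack frame looks like a launcher function."""
    return any(name.startswith(LAUNCHER_PREFIXES) for name in record["frames"])


# The Open MPI record quoted in the question.
SAMPLE = """
==16210== 4 bytes in 1 blocks are definitely lost in loss record 5 of 327
==16210== at 0x4C2DBB6: malloc (vg_replace_malloc.c:299)
==16210== by 0x5657A59: strdup (strdup.c:42)
==16210== by 0x51128E6: opal_basename (in /home/vshmelev/OMPI_2.0.2/lib/libopen-pal.so.20.2.0)
==16210== by 0x7DDECA9: ???
==16210== by 0x7DDEDD4: ???
==16210== by 0x6FBFF84: ???
==16210== by 0x4E4EA9E: orte_init (in /home/vshmelev/OMPI_2.0.2/lib/libopen-rte.so.20.1.0)
==16210== by 0x4041FD: orterun (orterun.c:818)
==16210== by 0x4034E5: main (main.c:13)
"""

for record in parse_loss_records(SAMPLE):
    print(record["kind"], record["bytes"], from_launcher(record))
```

Both traces in the question end in the launcher's own `main` (`main.c:13` for `orterun`, `mpiexec.c:343` for Hydra), which suggests Valgrind was wrapping the launcher process. To inspect the application itself, Valgrind is normally placed after the launcher, for example `mpirun -np 2 valgrind ./app`, where `./app` is a placeholder for your binary.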

0

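The term "definitely lost" means that your program's main function, at line 13 (as far as I can see in the output), either leaks memory directly or calls some other function (orterun) that causes the leak. You must fix these leaks or provide more code.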

Take a look here before anything else.

+0

However, the main.c file is not one of my source files.
