mpi4py compilation error on a SUSE system
After compiling mpi4py against the Open MPI installed on the server, I get a runtime error.
OS: SuSe
GCC: 4.8.5
OpenMPI: 1.10.1
HDF5: 1.8.11
mpi4py: 2.0.0
Python: 2.7.9
Environment setup: I am using a virtualenv (I have no admin rights on the server)
(ENV) username@servername:~/test> echo $PATH
/opt/local/tools/hdf5/hdf5-1.8.11_openmpi-1.10.1_gcc-4.8.5/bin:/opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5/bin:/home/username/test/virtualenv-15.0.3/ENV/bin: [other libs ] :/opt/local/bin:/usr/lib64/mpi/gcc/openmpi/bin:/usr/local/bin:/usr/bin:/bin
(ENV) username@servername:echo $LD_LIBRARY_PATH
/opt/local/tools/hdf5/hdf5-1.8.11_openmpi-1.10.1_gcc-4.8.5/lib:/opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5/lib
(ENV) username@servername:~/test> pip freeze
cycler==0.10.0
Cython==0.24.1
dill==0.2.5
matplotlib==1.5.3
multiprocessing==2.6.2.1
numpy==1.11.1
pyfits==3.4
pyparsing==2.1.9
python-dateutil==2.5.3
pytz==2016.6.1
scipy==0.18.1
six==1.10.0
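Since the PATH above lists more than one Open MPI `bin` directory (the custom 1.10.1 build and `/usr/lib64/mpi/gcc/openmpi/bin`), it may help to confirm which `mpicc` and `mpiexec` are found first. The following is only a sketch for checking this; `find_all_on_path` is a hypothetical helper name, not part of mpi4py or Open MPI.

```python
import os


def find_all_on_path(name, path):
    """Return every executable called `name` in `path`, in search order."""
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            # Skip empty entries instead of treating them as the current dir.
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    for tool in ("mpicc", "mpiexec"):
        # The first hit is the one the shell would run.
        print("%s: %s" % (tool, find_all_on_path(tool, os.environ.get("PATH", ""))))
```

If the first hit for each tool is not under the custom Open MPI prefix, the build and the runtime may be using different installations.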
Compile and install mpi4py:
(ENV) username@servername:~/test> wget https://bitbucket.org/mpi4py/mpi4py/downloads/mpi4py-2.0.0.tar.gz
(ENV) username@servername:~/test> tar xzvf mpi4py-2.0.0.tar.gz
(ENV) username@servername:~/test> cd mpi4py-2.0.0/
(ENV) username@servername:~/test> vim mpi.cfg
In mpi.cfg I added a section for my custom Open MPI:
[mpi]
mpi_dir = /opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5
mpicc = %(mpi_dir)s/bin/mpicc
mpicxx = %(mpi_dir)s/bin/mpicxx
library_dirs = %(mpi_dir)s/lib
runtime_library_dirs = %(library_dirs)s
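To double-check that the `%(mpi_dir)s` placeholders in this section expand to the paths I expect, the section can be fed to Python's standard `configparser`. This is not mpi4py's own build code, only a sketch of the interpolation; it is written for Python 3 (`read_string` is not available in Python 2.7), and `expand_section` is a hypothetical helper name.

```python
import configparser

# The [mpi] section exactly as added to mpi.cfg.
MPI_CFG = """
[mpi]
mpi_dir = /opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5
mpicc = %(mpi_dir)s/bin/mpicc
mpicxx = %(mpi_dir)s/bin/mpicxx
library_dirs = %(mpi_dir)s/lib
runtime_library_dirs = %(library_dirs)s
"""


def expand_section(text, section):
    """Parse `text` and return the section with %(name)s placeholders expanded."""
    parser = configparser.ConfigParser()  # basic interpolation is the default
    parser.read_string(text)
    return dict(parser.items(section))


if __name__ == "__main__":
    for key, value in sorted(expand_section(MPI_CFG, "mpi").items()):
        print("%s = %s" % (key, value))
```

With the section above, `mpicc` should resolve to the compiler wrapper under the custom prefix, and `runtime_library_dirs` to the same `lib` directory as `library_dirs`.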
Build:
(ENV) username@servername:python setup.py build --mpi=mpi
Install:
(ENV) username@servername:python setup.py install
First basic test (OK):
(ENV) username@servername: mpiexec -n 5 python -m mpi4py helloworld
Hello, World! I am process 0 of 5 on servername.
Hello, World! I am process 1 of 5 on servername.
Hello, World! I am process 2 of 5 on servername.
Hello, World! I am process 3 of 5 on servername.
Hello, World! I am process 4 of 5 on servername.
Second basic test produces an error:
(ENV) username@servername: python
>>> from mpi4py import MPI
--------------------------------------------------------------------------
Error obtaining unique transport key from ORTE (orte_precondition_transports not present in the environment).
Local host: servername
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during MPI_INIT; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):
PML add procs failed
--> Returned "Error" (-1) instead of "Success" (0)
-------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[servername:165332] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
(ENV) username@servername:~/test/mpi4py-2.0.0>
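One visible difference between the two tests is how Python was started: the first ran under `mpiexec`, the second ran the interpreter directly. A small check like the one below can tell the two cases apart from inside Python without importing mpi4py. It is a sketch that assumes Open MPI's launcher exports `OMPI_COMM_WORLD_SIZE` and `OMPI_COMM_WORLD_RANK` to the processes it starts; the function names are hypothetical, and the check does not by itself explain the MPI_INIT failure.

```python
import os

# Variables that Open MPI's launcher is assumed to export to each process.
OMPI_LAUNCH_VARS = ("OMPI_COMM_WORLD_SIZE", "OMPI_COMM_WORLD_RANK")


def launched_by_mpiexec(environ):
    """Return True if `environ` looks like a process started by mpiexec."""
    return all(name in environ for name in OMPI_LAUNCH_VARS)


def describe_launch(environ):
    """Summarise how the current process appears to have been started."""
    if launched_by_mpiexec(environ):
        return "rank %s of %s (started by mpiexec)" % (
            environ["OMPI_COMM_WORLD_RANK"],
            environ["OMPI_COMM_WORLD_SIZE"],
        )
    return "singleton (started without mpiexec)"


if __name__ == "__main__":
    print(describe_launch(os.environ))
```

Running this with plain `python` and again under `mpiexec -n 5 python` should show which of the two modes each of the tests above was using.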
Update: during the mpi4py build I get this error:
checking for library 'lmpe' ...
/opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5/bin/mpicc -pthread -fno-strict-aliasing -fmessage-length=0 -grecord-gcc-switches -fstack-protector -O2 -Wall -D_FORTIFY_SOURCE=2 -funwind-tables -fasynchronous-unwind-tables -g -DNDEBUG -fmessage-length=0 -grecord-gcc-switches -fstack-protector -O2 -Wall -D_FORTIFY_SOURCE=2 -funwind-tables -fasynchronous-unwind-tables -g -DOPENSSL_LOAD_CONF -fPIC -I/opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5/include -c _configtest.c -o _configtest.o
/opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5/bin/mpicc -pthread _configtest.o -L/opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5/lib -Wl,-R/opt/local/mpi/openmpi/openmpi-1.10.1_gcc-4.8.5/lib -llmpe -o _configtest
/usr/lib64/gcc/x86_64-suse-linux/4.8/../../../../x86_64-suse-linux/bin/ld: cannot find -llmpe
collect2: error: ld returned 1 exit status
failure.
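When the build log is long, the libraries the linker could not find can be pulled out with a short script. The sketch below only scans for `cannot find -l<name>` lines such as the one above; `missing_libraries` is a hypothetical helper, and the sample text is an excerpt of the output shown here.

```python
import re

# Matches the linker message "cannot find -l<name>" and captures <name>.
MISSING_LIB = re.compile(r"cannot find -l(\S+)")

# Excerpt of the build output from this post.
SAMPLE_LOG = """\
checking for library 'lmpe' ...
bin/ld: cannot find -llmpe
collect2: error: ld returned 1 exit status
failure.
"""


def missing_libraries(log_text):
    """Return library names the linker could not find, in order, no duplicates."""
    found = []
    for name in MISSING_LIB.findall(log_text):
        if name not in found:
            found.append(name)
    return found


if __name__ == "__main__":
    print(missing_libraries(SAMPLE_LOG))
```

For the excerpt above the only name reported is `lmpe`, the library named in the `checking for library 'lmpe'` line.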
For updates, see: https://bitbucket.org/mpi4py/mpi4py/issues/52/mpi4py-compilation-error