MPI_Comm_spawn on remote nodes

How do I use MPI_Comm_spawn to start worker processes on a remote node?
I'm using Open MPI 1.4.3, and I tried this code:
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "host", "node2");
    MPI_Comm intercom;
    MPI_Comm_spawn("worker",
                   MPI_ARGV_NULL,
                   nprocs,
                   info,
                   0,
                   MPI_COMM_SELF,
                   &intercom,
                   MPI_ERRCODES_IGNORE);
But it fails with this error message:
--------------------------------------------------------------------------
There are no allocated resources for the application worker that match the requested mapping:
Verify that you have mapped the allocated resources properly using the --host or --hostfile specification.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
A daemon (pid unknown) died unexpectedly on signal 1 while attempting to launch so we are aborting.
There may be more information reported by the environment (see above).
This may be because the daemon was unable to find all the needed shared libraries on the remote node. You may set your LD_LIBRARY_PATH to have the location of the shared libraries on the remote nodes and this will automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
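If I replace "node2" with the name of my local machine, it works fine. If I ssh into node2 and run the same thing there (with "node2" in the info dictionary), it also works fine.
I don't want to start the parent process with mpirun, so I'm looking for a way to dynamically spawn processes on remote nodes. Is this possible?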
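Thanks. The reason I want to avoid mpirun is that I'm writing a MATLAB mex file to offload some computation. So all I have is a C file that MATLAB calls for me, which means the hostnames have to be specified programmatically. I guess that means I'd have to somehow call mpirun from a new process inside my mex file? – krashalot 2010-11-24 00:25:59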
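
The idea in the comment above, calling mpirun from a new process with a hostname chosen at run time, could look roughly like the sketch below. It is only a sketch under stated assumptions: build_mpirun_command and launch_workers are hypothetical helper names, not part of MPI or MATLAB; it assumes a POSIX system with mpirun on the PATH and relies on mpirun's --host and -np options. Whether fork() is safe inside a MATLAB process is not something this sketch settles.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: formats "mpirun --host <host> -np <nprocs> <program>"
 * into buf, e.g. for logging what is about to be launched.
 * Returns 0 on success, -1 on bad arguments or if buf is too small. */
int build_mpirun_command(char *buf, size_t size, const char *host,
                         int nprocs, const char *program)
{
    if (buf == NULL || host == NULL || program == NULL || nprocs < 1)
        return -1;
    int n = snprintf(buf, size, "mpirun --host %s -np %d %s",
                     host, nprocs, program);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper: starts mpirun in a child process and waits for it.
 * Returns mpirun's exit status, or -1 if it could not be started or
 * did not exit normally. */
int launch_workers(const char *host, int nprocs, const char *program)
{
    char np[16];
    snprintf(np, sizeof np, "%d", nprocs);

    pid_t pid = fork();
    if (pid < 0)
        return -1;                      /* fork failed */
    if (pid == 0) {
        /* Child: replace this process image with mpirun. */
        execlp("mpirun", "mpirun", "--host", host, "-np", np,
               program, (char *)NULL);
        _exit(127);                     /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

From the mex entry point one would then call something like launch_workers("node2", nprocs, "worker"). Note that this only starts the workers; unlike MPI_Comm_spawn it gives the caller no intercommunicator, so data would have to reach the workers some other way.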