Depth-first search with dynamically spawned processes using MPI in Python

Okay, so I want to do a multiprocess depth-first search in a tree structure. I'm using processors from several computers in a cluster (in this example: a quad-core localhost and a Raspberry Pi 2). The main process should start the search, and at the first split of the tree it should spawn a new process for each child node. Those processes should then be able to report their findings back to the master.
I'm trying to do this dynamically instead of handing mpiexec a fixed number of processes, because I don't know in advance what the tree will look like (e.g. there could be 2 or 9 splits).
For this question I made a sample out of the project I'm working on, which works as follows: it takes one digit at a time from a string of digits, spawns a worker for each digit, and sends that digit to the worker.
For the master:
#!/usr/bin/python
from mpi4py import MPI
import datetime, sys, numpy, time
################ Set up MPI variables ################
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
name = MPI.Get_processor_name()
status = MPI.Status()
################ Master code ################
script = 'cpi.py'
for d in '34':
    try:
        print 'Trying to spawn child process...'
        icomm = MPI.COMM_SELF.Spawn(sys.executable, args=[script], maxprocs=1, root=0)
        spawnrank = icomm.Get_rank()
        icomm.send(d, dest=spawnrank, tag=11)
        print 'Spawned rank %d.' % spawnrank
    except Exception:
        raise ValueError('Spawn failed to start.')

solved = False
while solved == False:
    #while not comm.Iprobe(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG):
    #    print 'spawns doing some work...'
    #    time.sleep(1)
    solved = comm.recv(source=MPI.ANY_SOURCE, tag=22)
    print 'received solution: %d' % solved
It spawns the workers correctly and they receive the digit, but they never send it back to the master. The code for the worker is as follows:
The worker:
#!/usr/bin/python
from mpi4py import MPI
import datetime, sys, numpy
################ Set up MPI variables ################
icomm = MPI.Comm.Get_parent()
comm = MPI.COMM_WORLD
irank = comm.Get_rank()
rank = comm.Get_rank()
running = True
while running:
    data = None
    data = icomm.recv(source=0, tag=11)
    if data:
        print 'Trying to send %s from worker rank %d to %d' % (data, rank, irank)
        icomm.send(data, dest=0, tag=22)
        break

print 'Worker on rank %d done.' % rank
icomm.Disconnect()
It never reaches the last line of the master code. I also added a (commented-out) probe to the master code to check whether a message with tag 22 was hanging around somewhere, to rule out a bug in the recv call, but the probe never finds the message, so I assume it is never sent.
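One thing I suspect while writing this up: the worker sends on icomm, the intercommunicator to its parent, while the master receives on comm, its own MPI.COMM_WORLD, which does not contain the spawned children at all, so the two calls can never match. A minimal sketch of a master loop that instead keeps each spawn's intercommunicator and receives the replies there (same cpi.py worker as above, one worker per digit):

```python
#!/usr/bin/python
# Sketch: receive each worker's answer on the intercommunicator returned
# by Spawn, rather than on the master's own COMM_WORLD (which does not
# include the spawned children).
from mpi4py import MPI
import sys

script = 'cpi.py'
children = []                            # one intercommunicator per worker

for d in '34':
    icomm = MPI.COMM_SELF.Spawn(sys.executable, args=[script],
                                maxprocs=1, root=0)
    icomm.send(d, dest=0, tag=11)        # dest is a rank in the REMOTE group
    children.append(icomm)

for icomm in children:
    solved = icomm.recv(source=0, tag=22)  # matches the worker's icomm.send
    print 'received solution: %s' % solved
    icomm.Disconnect()
```

(Incidentally, icomm.Get_rank() in my original master returns the master's own rank in the intercommunicator's local group, not the child's rank; dest has to be a rank in the remote group, which for maxprocs=1 happens to be 0 anyway.)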
I figured out, by printing the ranks of the two processes, that they both use rank 0, which makes sense since they are spawned on the same computer. But then, when I added a hostfile and rankfile to try to force it to use a different computer for the slaves, it gave me the following error:
[hch-K55A:06917] *** Process received signal ***
[hch-K55A:06917] Signal: Segmentation fault (11)
[hch-K55A:06917] Signal code: Address not mapped (1)
[hch-K55A:06917] Failing at address: 0x3c
[hch-K55A:06917] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7f2c0d864340]
[hch-K55A:06917] [ 1] /usr/lib/openmpi/lib/openmpi/mca_rmaps_rank_file.so(orte_rmaps_rank_file_lex+0x4a0) [0x7f2c0abdcb70]
[hch-K55A:06917] [ 2] /usr/lib/openmpi/lib/openmpi/mca_rmaps_rank_file.so(+0x23ac) [0x7f2c0abda3ac]
[hch-K55A:06917] [ 3] /usr/lib/libopen-rte.so.4(orte_rmaps_base_map_job+0x2e) [0x7f2c0dacd05e]
[hch-K55A:06917] [ 4] /usr/lib/libopen-rte.so.4(orte_plm_base_setup_job+0x5a) [0x7f2c0dac580a]
[hch-K55A:06917] [ 5] /usr/lib/openmpi/lib/openmpi/mca_plm_rsh.so(orte_plm_rsh_launch+0x338) [0x7f2c0b80a8c8]
[hch-K55A:06917] [ 6] /usr/lib/libopen-rte.so.4(+0x51ff4) [0x7f2c0dac3ff4]
[hch-K55A:06917] [ 7] /usr/lib/libopen-rte.so.4(opal_event_base_loop+0x31e) [0x7f2c0dae9cfe]
[hch-K55A:06917] [ 8] mpiexec() [0x4047d3]
[hch-K55A:06917] [ 9] mpiexec() [0x40347d]
[hch-K55A:06917] [10] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f2c0d4b0ec5]
[hch-K55A:06917] [11] mpiexec() [0x403399]
[hch-K55A:06917] *** End of error message ***
Segmentation fault (core dumped)
Command used: mpiexec -np 1 --hostfile hostfile --rankfile rankfile python spawntest.py
Hostfile:
localhost
localhost slots=1 max-slots=4
pi2@raspi2 slots=4
Rankfile:
rank 0=localhost slot=1
rank 1=pi2@raspi2 slot=1-4
So my question is the following: how can I spawn these processes on computers other than the host machine, while still being able to send data back and forth?
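One direction I've been looking at, rather than the rankfile: the MPI standard lets you pass placement hints to Spawn through an Info object, where the reserved key "host" names the machine the children should run on (whether it is honored depends on the MPI implementation, Open MPI in my case). A sketch, assuming the Pi is reachable under the name raspi2 from the hostfile:

```python
#!/usr/bin/python
# Sketch: place the spawned worker on a specific host via an MPI Info
# object.  'host' is a reserved MPI info key for Spawn; support for it
# is implementation-dependent.
from mpi4py import MPI
import sys

info = MPI.Info.Create()
info.Set('host', 'raspi2')          # hostname as listed in the hostfile

icomm = MPI.COMM_SELF.Spawn(sys.executable, args=['cpi.py'],
                            maxprocs=1, info=info, root=0)
icomm.send('3', dest=0, tag=11)     # rank 0 of the remote (child) group
print 'reply from remote worker: %s' % icomm.recv(source=0, tag=22)
icomm.Disconnect()
```

I haven't confirmed this works around the rankfile segfault above; it is just the placement mechanism the standard provides for dynamically spawned processes.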