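Python multiprocessing map_async hangs
I'm having some trouble (possibly) with closing a process pool in my parser. When all the tasks are done, it hangs and does nothing, with CPU usage at about 1%.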
profiles_pool = multiprocessing.Pool(processes=4)
pages_pool = multiprocessing.Pool(processes=4)

m = multiprocessing.Manager()
pages = m.list(['URL'])
pages_done = m.list()

while True:
    # grab all links
    res = pages_pool.imap_unordered(deco_process_people, pages, chunksize=1)

    pages_done += pages
    pages = []

    for new_users, new_pages in res:
        users.update(new_users)
        profile_tasks = [(new_users[i]['link'], i) for i in new_users]
        # enqueue grabbed links for parsing
        profiles_pool.map_async(deco_process_profiles,
                                profile_tasks, chunksize=2,
                                callback=profile_update_callback)
        # I don't actually need the result of map_async:
        # the callback applies the parsed data to the users dict
        # (users is an instance of Manager.dict())

        for p in new_pages:
            if p not in pages_done and p not in pages:
                pages.append(p)

    # more than 900 pages must be parsed for the bug to occur
    #if len(pages) == 0:
    if len(pages_done) > 900:
        break

#
# closing other pools
#

# ---- the last printed string:
print 'Closing profiles pool',
sys.stdout.flush()
profiles_pool.close()
profiles_pool.join()
print 'closed'
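I suspect the problem is that the pool miscounts the open tasks in its queue, but I'm not sure and I can't check this: I don't know how to get the length of the task queue.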
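On checking the task count: as far as I know, `multiprocessing.Pool` has no documented call that returns the queue length. One workaround is to keep the `AsyncResult` returned by every `map_async` call and count the ones that are not `ready()` yet. The following is only a sketch under assumptions: `square` and `count_pending` are made-up names, the batches are toy data, and `ThreadPool` stands in for `Pool` (it exposes the same `map_async` API) just so the example runs without picklable worker functions.

```python
from multiprocessing.pool import ThreadPool  # same map_async API as multiprocessing.Pool


def square(x):
    # Hypothetical stand-in for deco_process_profiles.
    return x * x


def count_pending(async_results):
    """Return how many submitted batches have not finished yet."""
    return sum(1 for r in async_results if not r.ready())


pool = ThreadPool(processes=2)
handles = []  # keep every AsyncResult instead of discarding it

for batch in ([1, 2], [3, 4], [5, 6]):
    handles.append(pool.map_async(square, batch, chunksize=2))

pool.close()

# Bounded waits: a batch that never completes shows up in the count
# below instead of blocking forever.
for h in handles:
    h.wait(timeout=10)

pending_after_wait = count_pending(handles)
results = [h.get(timeout=1) for h in handles if h.ready()]

pool.join()
print(pending_after_wait, results)
```

If `count_pending` stays above zero after the waits, at least one batch never reported back, which would narrow the search to the workers or to the result handling rather than to the submission loop.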
What could it be, and what should I look at first?
Also, I noticed: the line that hangs is `profiles_pool.join()`. – 2012-03-19 14:42:54
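A hang in `join()` after `close()` means the pool still considers some work unfinished. One possible cause, and this is an assumption since nothing in the post confirms it, is the callback itself: the Python docs note that callbacks run in the pool's result-handling thread and should complete immediately, and in the implementations I have looked at, an exception raised inside a callback is not caught there. A cheap first check is to wrap the callback so any exception is recorded instead of propagating. In this sketch `guarded`, `parse_profile` and `update_users` are made-up names, the deliberate `KeyError` only simulates a buggy callback, and `ThreadPool` again stands in for `Pool`.

```python
import functools
import traceback
from multiprocessing.pool import ThreadPool  # stand-in with the same API as Pool

callback_errors = []  # exceptions caught inside callbacks


def guarded(callback):
    """Wrap a pool callback so an exception inside it is recorded, not raised."""
    @functools.wraps(callback)
    def wrapper(result):
        try:
            callback(result)
        except Exception as exc:
            callback_errors.append(exc)
            traceback.print_exc()
    return wrapper


def parse_profile(task):
    # Hypothetical stand-in for deco_process_profiles.
    link, user_id = task
    return user_id, len(link)


def update_users(parsed):
    # Hypothetical stand-in for profile_update_callback, with a deliberate bug.
    raise KeyError("missing user")


pool = ThreadPool(processes=2)
handle = pool.map_async(parse_profile, [("a/b", 1), ("c/d/e", 2)],
                        chunksize=2, callback=guarded(update_users))
handle.wait(timeout=10)  # bounded wait instead of an open-ended join
pool.close()
pool.join()
print(len(callback_errors), handle.ready())
```

If `callback_errors` fills up when the real `profile_update_callback` is wrapped this way, the callback is the first place to look; if it stays empty, the cause is more likely in the workers.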