How does Python's map_async keep results in order?

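I'm exploring Python's multiprocessing library on Python 3.3, and I've noticed an odd result from the map_async function that I haven't been able to explain. I expected the results stored by the callback to be "out of order". That is, if I feed a large number of tasks to the worker processes, some should finish before others, and not necessarily in the order in which they appear in the input list. Instead, I get a result set ordered exactly like the input tasks. This holds even after I deliberately try to "sabotage" some of the processes by slowing them down, which should let other processes finish ahead of them.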

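I have a print statement in the calculate function showing that the tasks are being computed out of order, yet the results are still in order. That said, I'm not sure I can trust the print statements as a good indicator that the work really is being computed out of order.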

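Test procedure (general example): build a list of objects, each of which holds an integer. Submit that list of objects to map_async together with the function "calculate", which updates each object's numValue attribute to its square. The "calculate" function then returns the object with the updated value.

Some code: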

import time
import multiprocessing
import random

class NumberHolder():
    def __init__(self, numValue):
        self.numValue = numValue  # Only one attribute

def calculate(obj):
    if random.random() >= 0.5:
        startTime = time.time()
        timeWaster = [random.random() for x in range(5000000)]  # Waste time.
        endTime = time.time()  # Establish end time
        print("%d object got stuck in here for %f seconds" % (obj.numValue, endTime - startTime))
    # Square the value and return the updated object
    obj.numValue = obj.numValue ** 2
    return obj

# Main Process
if __name__ == '__main__':
    numbersToSquare = [x for x in range(0, 100)]  #I'm
    taskList = []

    for eachNumber in numbersToSquare:
        taskList.append(NumberHolder(eachNumber))  # Create a list of objects whose numValue is equal to the numbers we want to square

    results = []  # Where the results will be stored
    # Don't use all my processing power, but always keep at least one worker.
    pool = multiprocessing.Pool(processes=max(1, multiprocessing.cpu_count() - 1))
    r = pool.map_async(calculate, taskList, callback=results.append)  # Using fxn "calculate", feed taskList, and values stored in "results" list
    r.wait()  # Wait on the results from the map_async
    pool.close()
    pool.join()

    # Kept inside the __main__ guard so worker processes never run this part
    results = results[0]  # All of the entries only exist in the first offset
    for eachObject in results:  # Loop through them and show them
        print(eachObject.numValue)  # If they calc'd "out of order", I'd expect append out of order

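I found this well-written answer, which seems to support the idea that map_async can return results "out of order": multiprocessing.Pool: When to use apply, apply_async or map?. I also looked at the documentation here (http://docs.python.org/3.3/library/multiprocessing.html). For map_async, it says: "...If callback is specified then it should be a callable which accepts a single argument. When the result becomes ready callback is applied to it (unless the call failed). callback should complete immediately since otherwise the thread which handles the results will get blocked."

Am I misunderstanding how this works? Any help is much appreciated.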

2013-10-30 Thomas

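Thanks for the response, @Blender. That brings me back to a simpler question, then. With results.append as the callback, does the append only happen once all of the results are ready? From reading the documentation, I thought it was called as each result became available: "When the result becomes ready callback is applied to it" – Thomas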

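This is the expected behavior. The documentation says:

A variant of the map() method which returns a result object.

The "result object" is just a container class that holds the computed results. When you call r.wait(), you wait until all of the results have been gathered and put in order. Even though the tasks are not processed in order, the results will still be in the original order.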
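
To make that concrete, here is a minimal sketch (not the asker's code). It assumes multiprocessing.pool.ThreadPool as a stand-in for multiprocessing.Pool, since it exposes the same map_async interface and lets the snippet record completion order in a plain list; slow_square, run_map_async, and the sleep timings are made up for illustration.

```python
import time
from multiprocessing.pool import ThreadPool  # assumption: stand-in with the same API as Pool


def slow_square(n):
    # Hypothetical workload: smaller inputs sleep longer,
    # so later tasks tend to finish before earlier ones.
    time.sleep((5 - n) * 0.05)
    return n * n


def run_map_async():
    finished = []       # inputs, in the order their tasks actually completed
    callback_args = []  # every argument the callback was invoked with

    def work(n):
        result = slow_square(n)
        finished.append(n)
        return result

    with ThreadPool(processes=5) as pool:
        r = pool.map_async(work, range(5), callback=callback_args.append)
        r.wait()  # returns once every task is done and the callback has run
    return finished, callback_args


if __name__ == '__main__':
    finished, callback_args = run_map_async()
    print("completion order:", finished)
    print("callback received:", callback_args)
```

The completion order typically differs from the input order, but the callback fires only once, with the whole list already arranged in input order. That is why results[0] in the question holds everything.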

If you want the results in the order they are computed, use imap_unordered.
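
A rough sketch of that alternative, again assuming ThreadPool as a stand-in for multiprocessing.Pool (both provide imap_unordered) and a made-up slow_square_tagged helper that returns its input alongside the square so the arrival order is visible:

```python
import time
from multiprocessing.pool import ThreadPool  # assumption: stand-in with the same API as Pool


def slow_square_tagged(n):
    # Hypothetical workload: smaller inputs sleep longer.
    time.sleep((5 - n) * 0.05)
    return n, n * n


def run_imap_unordered():
    with ThreadPool(processes=5) as pool:
        # Results are yielded as soon as each task finishes,
        # not in the order of the input iterable.
        return list(pool.imap_unordered(slow_square_tagged, range(5)))


if __name__ == '__main__':
    for n, squared in run_imap_unordered():
        print(n, squared)
```

Because the arrival order depends on timing, it can vary from run to run. Returning the input with each result, as done here, is one way to keep track of which task produced what.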

2013-10-30 18:15:47 Blender

Thanks, this is very helpful! I definitely misunderstood this, particularly how to distinguish individual results from the result object. – Thomas
