0

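Returning Counter objects from a multiprocessing / map function

I have a Python script that starts the same function in multiple processes. Each function call creates and fills two counters (c1 and c2). The c1 counters returned by all forked processes should be merged into one, and likewise the c2 counters returned by the different forks.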

My (pseudo) code looks like this:

from collections import Counter
import multiprocessing

def countIt(cfg):
    c1 = Counter()
    c2 = Counter()
    # do some things and fill the counters by counting words in a text, like
    # c1 = Counter({'apple': 3, 'banana': 0})
    # c2 = Counter({'blue': 3, 'green': 0})

    return c1, c2

if __name__ == '__main__':
    cP1 = Counter()
    cP2 = Counter()
    cfg = "myConfig"
    p = multiprocessing.Pool(4)  # creating 4 forks
    c1, c2 = p.map(countIt, cfg)[:2]
    # 1.) This will only work with [:2], which seems to be no good idea
    # 2.) at this point c1 and c2 are lists, not a counter anymore,
    # so the following will not work:
    cP1 + c1
    cP2 + c2

Following the example above, I need a result like:

    cP1 = Counter({'apple': 25, 'banana': 247, 'orange': 24})
    cP2 = Counter({'red': 11, 'blue': 56, 'green': 3})

So my question: how can I count things inside the forked processes and then aggregate each counter (all c1 and all c2) in the parent process?

+0

@mattm That does not work, because `sum()` does not return a Counter? The following error occurs: `TypeError: unsupported operand type(s) for +: 'int' and 'Counter'` –
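
The `TypeError` quoted in this comment is consistent with how the built-in `sum()` behaves: unless a start value is passed, it begins with the integer `0`, so the first addition is `0 + Counter`. A minimal sketch, assuming the per-worker counters are already collected in a list (the name `partial_counts` and the sample data are made up for illustration):

```python
from collections import Counter

# Hypothetical per-worker results; in the question they would come from the pool.
partial_counts = [
    Counter({'apple': 3, 'banana': 1}),
    Counter({'apple': 2, 'orange': 4}),
]

# sum() starts from 0 by default, so pass an empty Counter as the start value.
total = sum(partial_counts, Counter())
print(total)
```

Each `+` builds a new Counter, so for a long list of results an in-place loop is likely cheaper, but for a handful of workers the one-liner should be fine.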

+1

At the very least, this line is definitely a bug: `c1, c2 = p.map(countIt, cfg)[:2]`. See swenzel's answer for how to handle the result. – KobeJohn

Answers

2

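You need to "unpack" your results, for example with a for-each loop. `map` gives you a list of tuples, where each tuple is a pair of counters: (c1, c2).
Your current solution actually mixes them up. You assign [(c1a, c2a), (c1b, c2b)] to c1, c2, which means c1 contains (c1a, c2a) and c2 contains (c1b, c2b).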
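
To make the mix-up concrete without starting any processes, here is a small sketch that uses a hand-written list in place of the pool output. The `results` list and its contents are assumptions for illustration only; `zip(*results)` is one alternative to a loop for separating all the c1 counters from all the c2 counters.

```python
from collections import Counter

# Hypothetical stand-in for what p.map(countIt, configs) returns:
# one (c1, c2) tuple per config.
results = [
    (Counter({'apple': 3}), Counter({'blue': 3})),
    (Counter({'apple': 2, 'banana': 1}), Counter({'red': 1})),
]

# The assignment from the question binds whole tuples, not counters.
first, second = results[:2]
print(type(first).__name__)

# zip(*results) transposes the list of pairs into (all c1s, all c2s).
all_c1, all_c2 = zip(*results)

# Merge each group, starting from an empty Counter.
cP1 = sum(all_c1, Counter())
cP2 = sum(all_c2, Counter())
print(cP1)
print(cP2)
```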

Try this:

if __name__ == '__main__':
    from contextlib import closing

    cP1 = Counter()
    cP2 = Counter()

    # I hope you have an actual list of configs here, otherwise map
    # will call `countIt` with the single characters of the string 'myConfig'
    cfg = "myConfig"

    # `contextlib.closing` makes sure the pool is closed after we're done.
    # In python3, Pool is itself a context manager and you don't need to
    # surround it with `closing` in order to be able to use it in the `with`
    # construct.
    # This approach, however, is compatible with both python2 and python3.
    with closing(multiprocessing.Pool(4)) as p:
        # Just counting, no need to order the results.
        # This might actually be a bit faster.
        for c1, c2 in p.imap_unordered(countIt, cfg):
            cP1 += c1
            cP2 += c2
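
One detail about the `+=` used in the loop: in-place Counter addition keeps only positive counts, so a key such as `'banana': 0` from the question's sample data would not survive the merge. If zero counts matter, `Counter.update()` may be the better fit. A small sketch, with `worker_result` as made-up sample data:

```python
from collections import Counter

# Hypothetical result from one worker, including a zero count.
worker_result = Counter({'apple': 3, 'banana': 0})

added = Counter()
added += worker_result          # in-place addition drops non-positive counts

updated = Counter()
updated.update(worker_result)   # update() keeps every key, including zeros

print('banana' in added)
print('banana' in updated)
```

For plain word counting, where every stored count is at least 1, both forms should give the same totals.
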
+0

Not the OP, but thanks for improving the code with the `closing` context manager. I had not seen it before, maybe because I have not used mp in py3 yet. – KobeJohn

+0

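Thanks, this works. A few minutes earlier I had found out that Python builds a list of the results from all forks, like `[(Counter(), Counter()), (Counter(), Counter()), ...]`, so your answer fits exactly. Thanks. Using `closing` is completely new to me, but interesting! :-) –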