2012-04-07

This code shows the structure of what I am trying to do: make a copy of an object rather than re-initialising it in each new multiprocessing process.

import multiprocessing 
from foo import really_expensive_to_compute_object 

## Create a really complicated object that is *hard* to initialise. 
T = really_expensive_to_compute_object(10) 

def f(x): 
    return T.cheap_calculation(x) 

P = multiprocessing.Pool(processes=64) 
results = P.map(f, range(1000000)) 

print(results) 

The problem is that every process spends a lot of time recomputing T instead of using the original T that was computed once at startup. Is there a way to prevent this? T has a fast (deep) copy method, so can I make Python use that instead of recomputing?

Answers


Why not have f take T as a parameter instead of referencing the global, and do the copying yourself?

import multiprocessing, copy 
from foo import really_expensive_to_compute_object 

## Create a really complicated object that is *hard* to initialise. 
T = really_expensive_to_compute_object(10) 

def f(t, x): 
    return t.cheap_calculation(x) 

P = multiprocessing.Pool(processes=64) 
results = P.starmap(f, ((copy.deepcopy(T), x) for x in range(1000000))) 

print(results) 

The multiprocessing documentation suggests:

Explicitly pass resources to child processes

So your code could be rewritten to something like this:

import multiprocessing 
import time 
import functools 

class really_expensive_to_compute_object(object): 
    def __init__(self, arg): 
        print('expensive creation') 
        time.sleep(3) 

    def cheap_calculation(self, x): 
        return x * 2 

def f(T, x): 
    return T.cheap_calculation(x) 

if __name__ == '__main__': 
    ## Create a really complicated object that is *hard* to initialise. 
    T = really_expensive_to_compute_object(10) 
    ## helper, to pass expensive object to function 
    f_helper = functools.partial(f, T) 
    # i've reduced count for tests 
    P = multiprocessing.Pool(processes=4) 
    results = P.map(f_helper, range(100)) 

    print(results) 