2016-04-20 48 views

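I finally understood how to use dill in place of pickle from the following discussion: pickle-dill. For example, the following code, which uses dill with multiprocessing, works for me: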

import os 
import dill 
import multiprocessing 

def run_dill_encoded(what): 
    fun, args = dill.loads(what) 
    return fun(*args) 

def apply_async(pool, fun, args): 
    return pool.apply_async(run_dill_encoded, (dill.dumps((fun, args)),)) 

if __name__ == '__main__': 

    pool = multiprocessing.Pool(5) 
    results = [apply_async(pool, lambda x: x*x, args=(x,)) for x in range(1,7)] 
    output = [p.get() for p in results] 
    print(output) 

I tried to apply the same idea to pymongo. The following code:

import os 
import dill 
import multiprocessing 
import pymongo 

def run_dill_encoded(what): 
    fun, args = dill.loads(what) 
    return fun(*args) 


def apply_async(pool, fun, args): 
    return pool.apply_async(run_dill_encoded, (dill.dumps((fun, args)),)) 


def write_to_db(value_to_insert):   
    client = pymongo.MongoClient('localhost', 27017) 
    db = client['somedb'] 
    collection = db['somecollection'] 
    result = collection.insert_one({"filed1": value_to_insert}) 
    client.close() 

if __name__ == '__main__': 
    pool = multiprocessing.Pool(5) 
    results = [apply_async(pool, write_to_db, args=(x,)) for x in ['one', 'two', 'three']] 
    output = [p.get() for p in results] 
    print(output) 

produces this error:

multiprocessing.pool.RemoteTraceback: 
""" 
Traceback (most recent call last): 
    File "C:\Python34\lib\multiprocessing\pool.py", line 119, in worker 
    result = (True, func(*args, **kwds)) 
    File "C:\...\temp2.py", line 10, in run_dill_encoded 
    return fun(*args) 
    File "C:\...\temp2.py", line 21, in write_to_db 
    client = pymongo.MongoClient('localhost', 27017) 
NameError: name 'pymongo' is not defined 
""" 

The above exception was the direct cause of the following exception: 

Traceback (most recent call last): 
    File "C:/.../temp2.py", line 32, in <module> 
    output = [p.get() for p in results] 
    File "C:/.../temp2.py", line 32, in <listcomp> 
    output = [p.get() for p in results] 
    File "C:\Python34\lib\multiprocessing\pool.py", line 599, in get 
    raise self._value 
NameError: name 'pymongo' is not defined 

Process finished with exit code 1 

What is wrong?


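Hi, I'm the 'dill' author. It looks like 'pymongo' is not defined inside your function. Try putting 'import pymongo' inside 'write_to_db'. A function serializes better (or sometimes, at all) if you make sure that every variable it uses is defined locally.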


Also, there is an easier way to use 'dill' with 'multiprocessing'. Try the 'multiprocess' module: it is 'multiprocessing', but with 'pickle' replaced by 'dill'.


@MikeMcKerns, thank you very much! It works. I am still compiling 'multiprocess' for Python 3.x. By the way, does it have something like 'apply_async' for threads? – user1700890

Answer


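As I mentioned in the comments, you need to put an import pymongo inside the function write_to_db. This is because when the function is serialized, it does not take any global references with it when it is shipped to the other process space. The module-level import pymongo only binds the name in the parent process, so the worker raises NameError when the function looks it up.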
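
To illustrate the point, here is a minimal sketch that uses only the standard library. It does not show how dill works internally. The helper ship_without_globals is hypothetical: it rebuilds a function from its bytecode with an almost empty global namespace, which only imitates a function arriving in another process without the globals of the module that defined it. The two encode_* functions are also made up for the demonstration, and json stands in for pymongo so that nothing needs a database. The last function is the question's write_to_db with the import moved inside; it is defined but not called.

```python
import builtins
import json  # module-level import: stored in this module's globals
import marshal
import types


def ship_without_globals(func):
    """Rebuild func from its bytecode with a fresh, nearly empty global namespace.

    Hypothetical helper: it only imitates a function that reaches another
    process without the globals of the module that defined it.
    """
    code = marshal.loads(marshal.dumps(func.__code__))
    return types.FunctionType(code, {"__builtins__": builtins}, func.__name__)


def encode_with_global_import(value):
    # Relies on the module-level name 'json', which is not shipped along.
    return json.dumps({"field1": value})


def encode_with_local_import(value):
    import json  # resolved when the function runs, wherever it runs
    return json.dumps({"field1": value})


def write_to_db(value_to_insert):
    # The question's function with the import moved inside.
    # Not called here: it needs pymongo and a running MongoDB server.
    import pymongo
    client = pymongo.MongoClient('localhost', 27017)
    collection = client['somedb']['somecollection']
    collection.insert_one({"filed1": value_to_insert})
    client.close()


if __name__ == '__main__':
    # The version with the local import still works after "shipping".
    print(ship_without_globals(encode_with_local_import)("one"))
    # The version that depends on a global name fails, as in the question.
    try:
        ship_without_globals(encode_with_global_import)("one")
    except NameError as exc:
        print("NameError:", exc)
```

With the import inside write_to_db, the rest of the question's script (run_dill_encoded, apply_async and the pool setup) can stay as it is. Note that write_to_db returns nothing, so the printed output would be a list of None values.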