我在numpy數組中有參數集,這些參數集被送入多處理隊列,但在worker中接收時會出現亂碼。這是我的代碼來說明我的問題和問題。將numpy數組放在多處理隊列中遇到困難
import numpy as np
from multiprocessing import Process, Queue
NUMBER_OF_PROCESSES = 2
def worker(input, output):
for args in iter(input.get, 'STOP'):
print('Worker receives: ' + repr(args))
id, par = args
# simulate a complex task, and return result
result = par['A'] * par['B']
output.put((id, result))
# Define parameters to process
parameters = np.array([
(1.0, 2.0),
(3.0, 3.0)], dtype=[('A', 'd'), ('B', 'd')])
# Create queues
task_queue = Queue()
done_queue = Queue()
# Submit tasks
for id, par in enumerate(parameters):
obj = ('id_' + str(id), par)
print('Submitting task: ' + repr(obj))
task_queue.put(obj)
# Start worker processes
for i in range(NUMBER_OF_PROCESSES):
Process(target=worker, args=(task_queue, done_queue)).start()
# Get unordered results
results = {}
for i in range(len(parameters)):
id, result = done_queue.get()
results[id] = result
# Tell child processes to stop
for i in range(NUMBER_OF_PROCESSES):
task_queue.put('STOP')
print('results: ' + str(results))
隨着numpy的1.4.1和Python 2.6.6在64位的CentOS計算機上,我的輸出是:
Submitting task: ('id_0', (1.0, 2.0))
Submitting task: ('id_1', (3.0, 3.0))
Worker receives: ('id_0', (2.07827093387802e-316, 6.9204740511333381e-310))
Worker receives: ('id_1', (0.0, 1.8834810076011668e-316))
results: {'id_0': 0.0, 'id_1': 0.0}
正如所示,與numpy的記錄陣列中的元組在提交任務時狀態良好,但工作人員收到參數時出現亂碼,結果不正確。我在multiprocessing
documentation中讀到「代理方法的參數是可挑選的」。從我所知道的,numpy的陣列是完全picklable:
>>> import pickle
>>> for par in parameters:
... print(pickle.loads(pickle.dumps(par)))
...
(1.0, 2.0)
(3.0, 3.0)
我的問題是,爲什麼參數不正確的工作人員接收到?我怎樣才能傳遞一個numpy記錄數組的行給工人?
看起來像你的問題是一個錯誤,修正了這個提交https://github.com/numpy/numpy/commit/971bab3d51726b95f5afe0c22cbbd7983023f626 –