我生成了大小爲1000x1000
的100個隨機int矩陣。我正在使用多處理模塊來計算100個矩陣的特徵值。用於計算特徵值的多處理
的代碼如下:
import timeit
import numpy as np
import multiprocessing as mp
def calEigen():
S, U = np.linalg.eigh(a)
def multiprocess(processes):
pool = mp.Pool(processes=processes)
#Start timing here as I don't want to include time taken to initialize the processes
start = timeit.default_timer()
results = [pool.apply_async(calEigen, args=())]
stop = timeit.default_timer()
print (processes":", stop - start)
results = [p.get() for p in results]
results.sort() # to sort the results
if __name__ == "__main__":
global a
a=[]
for i in range(0,100):
a.append(np.random.randint(1,100,size=(1000,1000)))
#Print execution time without multiprocessing
start = timeit.default_timer()
calEigen()
stop = timeit.default_timer()
print stop - start
#With 1 process
multiprocess(1)
#With 2 processes
multiprocess(2)
#With 3 processes
multiprocess(3)
#With 4 processes
multiprocess(4)
輸出是
0.510247945786
('Process:', 1, 5.1021575927734375e-05)
('Process:', 2, 5.698204040527344e-05)
('Process:', 3, 8.320808410644531e-05)
('Process:', 4, 7.200241088867188e-05)
另一次迭代顯示輸出:
69.7296020985
('Process:', 1, 0.0009050369262695312)
('Process:', 2, 0.023727893829345703)
('Process:', 3, 0.0003509521484375)
('Process:', 4, 0.057518959045410156)
我的問題是:
- 爲什麼時間執行時間不會隨着 進程數量的增加而減少?我是否正確使用多處理模塊?
- 我是否正確計算執行時間?
我編輯了下面評論中給出的代碼。我想要串行和多處理函數來查找100個矩陣的相同列表的特徵值。編輯後的代碼是 -
import numpy as np
import time
from multiprocessing import Pool
a=[]
for i in range(0,100):
a.append(np.random.randint(1,100,size=(1000,1000)))
def serial(z):
result = []
start_time = time.time()
for i in range(0,100):
result.append(np.linalg.eigh(z[i])) #calculate eigen values and append to result list
end_time = time.time()
print("Single process took :", end_time - start_time, "seconds")
def caleigen(c):
result = []
result.append(np.linalg.eigh(c)) #calculate eigenvalues and append to result list
return result
def mp(x):
start_time = time.time()
with Pool(processes=x) as pool: # start a pool of 4 workers
result = pool.map_async(caleigen,a) # distribute work to workers
result = result.get() # collect result from MapResult object
end_time = time.time()
print("Mutltiprocessing took:", end_time - start_time, "seconds")
if __name__ == "__main__":
serial(a)
mp(1,a)
mp(2,a)
mp(3,a)
mp(4,a)
隨着進程數量的增加,時間不會減少。我哪裏錯了?多處理將列表劃分爲進程的塊還是必須進行劃分?
您沒有分配工作並將其分發到您的流程。所以沒有合作。它更像每個進程自己進行完整計算,而當有更多進程同時執行相同的事情時,由於CPU負載更多,而不是一個進程正在執行完整的計算。如果你分工並分發給你的工作人員,它應該更快。 – dopstar