Python - multiprocessing matplotlib griddata

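Following on from my previous question [1], I would like to apply multiprocessing to matplotlib's griddata function. Is it possible to split the griddata call into 4 parts, one for each of my 4 cores? I need this to improve performance.

For example, try the code below, experimenting with different values of size: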

import numpy as np 
import matplotlib.mlab as mlab 
import time 

size = 500 

Y = np.arange(size) 
X = np.arange(size) 
x, y = np.meshgrid(X, Y) 
u = x * np.sin(5) + y * np.cos(5) 
v = x * np.cos(5) + y * np.sin(5) 
test = x + y 

tic = time.clock() 

test_d = mlab.griddata(
    x.flatten(), y.flatten(), test.flatten(), x+u, y+v, interp='linear') 

toc = time.clock() 

print 'Time=', toc-tic

Source

2015-04-25 user3601754

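I don't think you can apply multiprocessing here. Maybe this question helps: http://stackoverflow.com/q/7424777/566035 – otterb

The example code is not syntactically correct. What do you intend with the following line: 'test = xx + yy' – OYRM

I fixed the code; it should run now. –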

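I ran the example code below on Python 3.4.2, with numpy version 1.9.1 and matplotlib version 1.4.2, on a MacBook Pro with 4 physical CPUs (as opposed to "virtual" CPUs, which the Mac hardware architecture also makes available for some use cases):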

import numpy as np 
import matplotlib.mlab as mlab 
import time 
import multiprocessing 

# This value should be set much larger than nprocs, defined later below 
size = 500 

Y = np.arange(size) 
X = np.arange(size) 
x, y = np.meshgrid(X, Y) 
u = x * np.sin(5) + y * np.cos(5) 
v = x * np.cos(5) + y * np.sin(5) 
test = x + y 

tic = time.clock() 

test_d = mlab.griddata(
    x.flatten(), y.flatten(), test.flatten(), x+u, y+v, interp='linear') 

toc = time.clock() 

print('Single Processor Time={0}'.format(toc-tic)) 

# Put interpolation points into a single array so that we can slice it easily 
xi = x + u 
yi = y + v 
# My example test machine has 4 physical CPUs 
nprocs = 4 
jump = int(size/nprocs) 

# Enclose the griddata function in a wrapper which will communicate its 
# output result back to the calling process via a Queue 
def wrapper(x, y, z, xi, yi, q): 
    test_w = mlab.griddata(x, y, z, xi, yi, interp='linear') 
    q.put(test_w) 

# Measure the elapsed time for multiprocessing separately 
ticm = time.clock() 

queue, process = [], [] 
for n in range(nprocs): 
    queue.append(multiprocessing.Queue()) 
    # Handle the possibility that size is not evenly divisible by nprocs 
    if n == (nprocs-1):
        finalidx = size
    else:
        finalidx = (n + 1) * jump
    # Define the arguments, dividing the interpolation variables into
    # nprocs roughly evenly sized slices
    argtuple = (x.flatten(), y.flatten(), test.flatten(),
                xi[:,(n*jump):finalidx], yi[:,(n*jump):finalidx], queue[-1])
    # Create the processes, and launch them 
    process.append(multiprocessing.Process(target=wrapper, args=argtuple)) 
    process[-1].start() 

# Initialize an array to hold the return value, and make sure that it is 
# null-valued but of the appropriate size 
test_m = np.asarray([[] for s in range(size)]) 
# Read the individual results back from the queues and concatenate them 
# into the return array 
for q, p in zip(queue, process): 
    test_m = np.concatenate((test_m, q.get()), axis=1) 
    p.join() 

tocm = time.clock() 

print('Multiprocessing Time={0}'.format(tocm-ticm)) 

# Check that the result of both methods is actually the same; should raise 
# an AssertionError exception if assertion is not True 
assert np.all(test_d == test_m)

and I got the following results:

/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/matplotlib/tri/triangulation.py:110: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future.
  self._neighbors)
Single Processor Time=8.495998 
Multiprocessing Time=2.249938

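I'm not sure what caused the "FutureWarning" from triangulation.py (apparently my version of matplotlib didn't like something about the input values originally provided in the question), but regardless, the multiprocessing ~~does appear to achieve the desired speedup of 8.50/2.25 = 3.8~~ (edit: see comments), which is roughly in the neighborhood of the 4x we'd expect for a machine with 4 CPUs. Finally, the assert statement at the end also executes successfully, showing that the two methods give the same answer, so in spite of the slightly odd warning message, I believe the code above is a valid solution.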
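As a small illustration of the slicing step used in the code above, the column ranges handed to each process can be computed by a standalone helper. This is only a sketch: the name slice_bounds is hypothetical (it does not appear in the original code), and it simply mirrors the jump/finalidx logic, with the last slice absorbing any remainder.

```python
def slice_bounds(size, nprocs):
    """Return one (start, stop) column range per process.

    Hypothetical helper mirroring the loop above: every slice is
    size // nprocs columns wide, except the last one, which runs to
    `size` and therefore absorbs any remainder.
    """
    jump = size // nprocs
    bounds = []
    for n in range(nprocs):
        start = n * jump
        stop = size if n == nprocs - 1 else (n + 1) * jump
        bounds.append((start, stop))
    return bounds


# Evenly divisible case, as in the example (size = 500, nprocs = 4)
print(slice_bounds(500, 4))
# Non-divisible case: the last slice is noticeably wider than the others
print(slice_bounds(10, 4))
```

When size is not much larger than nprocs, the last slice can end up far wider than the rest, which is presumably why the code comment recommends setting size much larger than nprocs.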

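EDIT: A commenter has pointed out that both my solution and the code snippet posted by the original author are probably using the wrong function, time.clock(), to measure execution time; he suggests using time.time() instead. I think I'm coming around to his point of view. (Digging further into the Python documentation, I'm still not convinced that even this is 100% correct, since newer versions of Python appear to have deprecated time.clock() in favor of time.perf_counter() and time.process_time(). But whether or not time.time() is the most correct way to take this measurement, it is still probably more correct than what I was using before, time.clock().)

Assuming the commenter's point is correct, the roughly 4x speedup I thought I had measured is in fact wrong.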
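To make the timing issue concrete, here is a minimal sketch contrasting wall-clock time with the CPU time of the calling process. It assumes Python 3.3 or later, where time.perf_counter() and time.process_time() are available. On Unix, time.clock() reported processor time for the calling process, so time spent waiting on worker processes was not counted; time.clock() was removed altogether in Python 3.8. The measure helper is a hypothetical name used only for this illustration, and time.sleep() stands in for a parent process waiting on its workers.

```python
import time


def measure(func):
    """Run func() and return (wall_seconds, cpu_seconds) for this process."""
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # CPU time of this process only
    func()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


# time.sleep stands in for "waiting on worker processes": the parent is
# idle, so its CPU time barely moves while wall-clock time keeps running.
wall, cpu = measure(lambda: time.sleep(0.2))
print('wall-clock={0:.3f}s  cpu={1:.3f}s'.format(wall, cpu))
```

A timer that only counts the parent's CPU time can therefore make a multiprocessing run look much faster than it really was.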

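However, that does not mean that the underlying code itself wasn't correctly parallelized; rather, it just means that parallelization didn't actually help in this case: splitting up the data and running on multiple processors didn't improve anything. Why would that be? Other users have pointed out that, at least in numpy/scipy, some functions run on multiple cores and some do not, and it can be a seriously challenging research project for an end user to figure out which are which.

Based on the results of this experiment, if my solution correctly achieves parallelization within Python but no further speedup is observed, then I would suggest that the simplest likely explanation is that matplotlib is probably also parallelizing some of its functions "under the hood", so to speak, in compiled C++ libraries, just as numpy/scipy already do. Assuming that's the case, the correct answer to this question would be that nothing further can be done: further parallelizing in Python does no good if the underlying C++ libraries are already silently running on multiple cores.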

Source

2015-05-01 16:54:21 stachyra

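Unfortunately, you're not measuring wall-clock time with time.clock() (see http://stackoverflow.com/a/23325328/1510289). Use time.time() instead and you'll see that the multiprocessing scenario takes longer. It's a nice try, though! I also tried splitting the input values myself and found no speedup for griddata() either. :( –

Sorry, but @stachyra's answer is incorrect. Replacing time.clock() with time.time() shows that the real wall-clock performance is worse. My 8-CPU machine gives: Single Processor Time=8.833 Multiprocessing Time=11.677 –

I can't get it to run... I get an error: "Traceback (most recent call last): File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap self._target(*self._args, **self._kwargs) File "", line 11, in wrapper test_w = mlab.griddata(x, y, z, xi, yi, interp='linear') File "/usr/lib/pymodules/python2.7/matplotlib/mlab.py", line 2619, in griddata raise ValueError("output grid must have constant spacing" ValueError: output grid must have constant spacing when using interp='linear'..." – user3601754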
