2013-10-15 116 views
2

Python multiprocessing of a sum

I can't get my code to run and, like others, I have trouble understanding how multiprocessing works. Here is my code so far:

if __name__ == "__main__": 
    start = time.clock() 
    bins = np.linspace(0,5 * 2 ** 15, 2 ** 15, endpoint=False) # 1e3 
    t_full = np.linspace(0, 0.2, 2 * bins.shape[0], endpoint=False) 
    po = Pool() 
    res = po.map_async(timeseries, ((m, n, params, bins, 1, t_full, i, i + 1) for i in xrange(2 ** 15))) 
    signal = sum(res.get()) 

where timeseries is given by

def timeseries_para(m, n, params, bins, seed, t, sum_min, sum_max): 
    np.random.seed(seed) 

    PSD_data = PSD(m, n, params, bins) 
    dataReal = np.empty_like(PSD_data) 

    for i in range(bins.shape[0]): 
        dataReal[i] = np.random.normal(PSD_data[i], 0.1 * PSD_data[i])

    plt.loglog(bins, dataReal, 'red') 

    dataCOS = np.sqrt(dataReal) 
    signal = np.zeros(t.shape[0]) 

    ## Calculating timeseries 
    #for i in range(bins.shape[0]): 
    for i in range(sum_min, sum_max): 
        #start = time.clock()
        signal += dataCOS[i] * np.cos(2 * np.pi * t * bins[i] + random.uniform(0, 2 * np.pi))
        #print time.clock() - start

    return signal 

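My sum runs from 0 up to 2 ** 16, so speeding this up is essential. My problems are, first, that I don't know how to call my function, and second, how to add up all the results it returns.

Thanks for any suggestions!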

+0

What exactly is the problem? Is multiprocessing not giving you the speed-up you are looking for? Or what is not working? –

+0

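Your problem is probably the 'po.map_async' syntax, which does not support multiple iterables. It therefore passes a single tuple of args to your function, but your function expects separate arguments. Change your function definition to accept a tuple: 'def timeseries_para((m,n,params,bins,seed,t,sum_min,sum_max))' – askewchan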
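To make the comment above concrete, here is a minimal sketch of the single-tuple calling convention: `Pool.map` hands each item of the iterable to the worker as one argument, so the worker unpacks the tuple itself (this form also works on Python 3, where tuple parameters in a `def` are no longer allowed). The names `partial_signal`, `make_tasks`, `add_partials` and `parallel_signal` are hypothetical stand-ins for `timeseries_para`, and the sketch uses plain lists and leaves out the random phases so that it runs without NumPy and gives reproducible numbers.

```python
import math
from multiprocessing import Pool


def partial_signal(args):
    """Worker for Pool.map: receives ONE tuple and unpacks it itself."""
    amplitudes, freqs, times, start, stop = args
    out = [0.0] * len(times)
    for i in range(start, stop):
        for k, t in enumerate(times):
            out[k] += amplitudes[i] * math.cos(2 * math.pi * freqs[i] * t)
    return out


def make_tasks(amplitudes, freqs, times, chunk):
    """Build one argument tuple per block of `chunk` frequencies."""
    n = len(freqs)
    return [(amplitudes, freqs, times, lo, min(lo + chunk, n))
            for lo in range(0, n, chunk)]


def add_partials(parts, length):
    """Element-wise sum of the partial signals returned by the workers."""
    total = [0.0] * length
    for part in parts:
        total = [a + b for a, b in zip(total, part)]
    return total


def parallel_signal(amplitudes, freqs, times, chunk=64, processes=None):
    """Distribute the frequency blocks over a pool and add up the results."""
    pool = Pool(processes)
    try:
        parts = pool.map(partial_signal,
                         make_tasks(amplitudes, freqs, times, chunk))
    finally:
        pool.close()
        pool.join()
    return add_partials(parts, len(times))
```

With NumPy arrays as partial results, the built-in `sum(parts)` adds them element-wise, which is what `sum(res.get())` in the question relies on. Call `parallel_signal` from under the `if __name__ == "__main__":` guard so that worker processes do not repeat the pool setup when the module is imported.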

Answers

1

In this solution I use the vectorized approach proposed here to avoid Python loops:

from multiprocessing import Pool 

import numpy as np 

def calc(t_full, w, dataCOS): 
    thetas = np.multiply.outer((2*np.pi*t_full), w) 
    thetas += 2*np.pi*np.random.random(thetas.shape) 

    signal = np.cos(thetas) 
    signal *= dataCOS 

    signal = signal.sum(-1) 

    return signal 

def parallel_calc(w, dataCOS, t_full, processes, num): 
    '''Parallel calculation 

    processes : integer
        Number of processes, usually one processor for each process
    num : integer
        Number of sub-divisions for `w` and `dataCOS`
        Must be an exact divisor of `len(w)` and `len(dataCOS)`
    ''' 
    pool = Pool(processes=processes) 
    # 
    results = [] 
    wd = np.vstack((w, dataCOS)) 
    for wd_s in np.split(wd.T, num): 
        w_s = wd_s.T[0]
        d_s = wd_s.T[1]
        results.append(pool.apply_async(calc, (t_full, w_s, d_s)))
    # 
    pool.close() 
    pool.join() 
    return sum((r.get() for r in results)) 

if __name__ == '__main__': 
    w = np.random.random(1000) 
    dataCOS = np.random.random(1000) 
    t_full = np.arange(2**16) 
    # 
    parallel_calc(w, dataCOS, t_full, 4, 10) 
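Two details of `parallel_calc` may be worth adjusting, sketched here under stated assumptions. First, `np.array_split` accepts a `num` that does not divide `len(w)` exactly, which lifts the restriction noted in the docstring. Second, on platforms where `Pool` forks its workers, each worker starts with a copy of the parent's global NumPy random state, so different workers can draw identical "random" phases; giving every chunk its own seeded `np.random.RandomState` avoids that. The names `calc_seeded`, `make_chunk_tasks` and `base_seed` are hypothetical and not part of the answer above.

```python
import numpy as np


def calc_seeded(t_full, w, dataCOS, seed):
    """Same arithmetic as `calc`, but with a private, seeded generator."""
    rng = np.random.RandomState(seed)
    thetas = np.multiply.outer(2 * np.pi * t_full, w)
    thetas += 2 * np.pi * rng.random_sample(thetas.shape)
    return (np.cos(thetas) * dataCOS).sum(-1)


def make_chunk_tasks(t_full, w, dataCOS, num, base_seed=0):
    """One argument tuple per chunk; `num` need not divide len(w)."""
    w_chunks = np.array_split(w, num)
    d_chunks = np.array_split(dataCOS, num)
    # Chunk k gets seed base_seed + k, so no two chunks share a stream.
    return [(t_full, w_s, d_s, base_seed + k)
            for k, (w_s, d_s) in enumerate(zip(w_chunks, d_chunks))]
```

Each tuple can then be submitted with `pool.apply_async(calc_seeded, task)` in place of the `calc` call, and the results summed exactly as before.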
+1

Looks good, I'll try it tomorrow – user2003965

0

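Okay, I got it running and, hell yeah, it is way faster. Unfortunately, it gives different results.

If I start from a PSD, compute the time series, and FFT it back to a PSD, I get something different.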

import time

import matplotlib.pyplot as plt
import numpy as np
import scipy.fftpack

pot = 13

bins = np.linspace(0, 5 * 2**pot, 2**pot, endpoint=False)
t_full = np.linspace(0, 0.2, 2 * bins.shape[0], endpoint=False)

PSD = 1000 / (1000**2 + bins**2)

plt.loglog(bins, PSD)

# parallel_calc is the function defined in the answer above
signal = parallel_calc(bins, PSD, t_full, 6, 1024)

start = time.clock()
n = signal.size
timestep = t_full[1] - t_full[0]

freq = np.fft.fftfreq(n, d=timestep)
freq = freq[:freq.size // 2]  # integer division: slice indices must be ints

PSD_from_timeserie = abs(scipy.fftpack.fft(signal) / n * 2)**2
PSD_from_timeserie = PSD_from_timeserie[:PSD_from_timeserie.size // 2]

plt.loglog(freq, PSD_from_timeserie, 'x')

#plt.plot(result)
plt.show()

This is what I get: [image], and this is what it should look like: [image] (with a different PSD, of course).
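One possible cause, not confirmed anywhere in this thread: the loop in the question adds a single random phase per frequency (`random.uniform(0, 2 * np.pi)` is a scalar applied to the whole `t` array), whereas `calc` draws `np.random.random(thetas.shape)`, that is, an independent phase for every time sample of every frequency, which would turn each cosine into noise. A minimal sketch of a per-frequency variant follows; the name `calc_one_phase_per_frequency` and the optional `rng` parameter are hypothetical.

```python
import numpy as np


def calc_one_phase_per_frequency(t_full, w, amplitude, rng=None):
    """Sum of cosines with ONE random phase per frequency (assumed intent)."""
    if rng is None:
        rng = np.random.RandomState()
    # Shape (len(w),): one phase for each frequency, constant over time.
    phases = 2 * np.pi * rng.random_sample(w.shape)
    # thetas[k, i] = 2*pi*t[k]*w[i] + phases[i]; phases broadcast over axis 0.
    thetas = np.multiply.outer(2 * np.pi * t_full, w) + phases
    return (amplitude * np.cos(thetas)).sum(-1)
```

As a sanity check, a single 5 Hz component of amplitude 2 sampled over one second should come back from `abs(np.fft.fft(sig)) / n * 2` as a peak of height 2 in bin 5, whatever the phase. Separately, the follow-up passes `PSD` itself as `dataCOS`, while the question used `np.sqrt(dataReal)`; since the comparison squares the FFT amplitude, passing `np.sqrt(PSD)` may be what is intended.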