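Numpy max is slow when applied to a list of arrays

I perform some computations to obtain a list of numpy arrays. I then want to find the maximum along the first axis. My current implementation (see below) is very slow, and I would like to find an alternative.
Original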
pending = [<list of items>]
matrix = [compute(item) for item in pending if <some condition on item>]
dominant = np.max(matrix, axis = 0)
Edit 1: this implementation is faster (~10x; presumably because numpy does not need to figure out the shape of the array)
pending = [<list of items>]
matrix = [compute(item) for item in pending if <some condition on item>]
matrix = np.vstack(matrix)
dominant = np.max(matrix, axis = 0)
I ran a few tests, and the slowdown appears to come from the internal conversion of the list of arrays into a numpy array:
Timer unit: 1e-06 s

Total time: 1.21389 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     4                                           def direct_max(list_of_arrays):
     5      1000      1213886   1213.9    100.0      np.max(list_of_arrays, axis = 0)

Total time: 1.20766 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     8                                           def numpy_max(list_of_arrays):
     9      1000      1151281   1151.3     95.3      list_of_arrays = np.array(list_of_arrays)
    10      1000        56384     56.4      4.7      np.max(list_of_arrays, axis = 0)

Total time: 0.15437 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    12                                           @profile
    13                                           def stack_max(list_of_arrays):
    14      1000       102205    102.2     66.2      list_of_arrays = np.vstack(list_of_arrays)
    15      1000        52165     52.2     33.8      np.max(list_of_arrays, axis = 0)
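
For anyone who wants to reproduce the comparison without line_profiler, here is a minimal sketch using timeit. The test data (100 random arrays of length 1000) is an assumption, not the data from the question, and the relative timings will likely vary with the NumPy version and the array shapes.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical test data: 100 one-dimensional arrays of length 1000.
list_of_arrays = [rng.random(1000) for _ in range(100)]

def direct_max(arrays):
    # np.max converts the list to an array internally.
    return np.max(arrays, axis=0)

def numpy_max(arrays):
    # Same conversion, made explicit with np.array.
    return np.max(np.array(arrays), axis=0)

def stack_max(arrays):
    # Build the 2-D array with np.vstack instead.
    return np.max(np.vstack(arrays), axis=0)

# Sanity check: all three variants must return the same result.
reference = direct_max(list_of_arrays)
assert np.array_equal(reference, numpy_max(list_of_arrays))
assert np.array_equal(reference, stack_max(list_of_arrays))

repeats = 200
for fn in (direct_max, numpy_max, stack_max):
    seconds = timeit.timeit(lambda: fn(list_of_arrays), number=repeats)
    print(f"{fn.__name__}: {seconds / repeats * 1e6:.1f} us per call")
```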
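Is there any way to speed up the max function, or is it possible to populate a numpy array efficiently with the results of my computation so that max is fast?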
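
Two possible ways of doing this are sketched below, assuming every result is a 1-D array of the same, known length. The functions `compute` and `keep`, the contents of `pending`, and `row_length` are hypothetical stand-ins for the real computation, condition and data. Option A fills a preallocated 2-D array so that no list-to-array conversion is needed; option B skips the matrix entirely and keeps a running elementwise maximum with `np.maximum`.

```python
import numpy as np

def compute(item):
    # Hypothetical stand-in for the real computation: returns a 1-D array.
    return np.arange(4, dtype=float) * item

def keep(item):
    # Hypothetical stand-in for the condition on each item.
    return item % 2 == 1

pending = list(range(6))  # hypothetical items
row_length = 4            # assumed to be known ahead of time

# Option A: preallocate for the worst case, fill row by row, then trim.
matrix = np.empty((len(pending), row_length))
n_rows = 0
for item in pending:
    if keep(item):
        matrix[n_rows] = compute(item)
        n_rows += 1
dominant_a = matrix[:n_rows].max(axis=0)

# Option B: keep only a running maximum and never store the full matrix.
dominant_b = np.full(row_length, -np.inf)
for item in pending:
    if keep(item):
        np.maximum(dominant_b, compute(item), out=dominant_b)
```

Note that option A raises a ValueError if no item passes the condition (max of an empty array), whereas option B would simply return an array of -inf in that case. Whether either option beats np.vstack for a given workload would need to be measured.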
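What data type are the 'items'? – mgilson 2013-04-10 18:00:56
The fastest approach is to start with a 2d numpy array in the first place instead of a list of arrays. If the lists have different lengths, just pad with -inf or nan. – Bitwise 2013-04-10 18:16:45
@mgilson: The items themselves are key-value pairs of the form (key: some hashable type, value: numpy array) – 2013-04-10 18:19:56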
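
The padding idea from Bitwise's comment could look roughly like the sketch below. The three input arrays are made-up example data, and padding with -inf is just one of the two choices mentioned in the comment.

```python
import numpy as np

# Hypothetical example data: arrays of different lengths.
arrays = [
    np.array([1.0, 5.0, 2.0]),
    np.array([4.0, 0.0]),
    np.array([3.0, 7.0, 1.0, 6.0]),
]

# Preallocate a 2-D array filled with -inf, wide enough for the longest array.
width = max(len(a) for a in arrays)
padded = np.full((len(arrays), width), -np.inf)

# Each row of `padded` is a view, so assigning into it fills the 2-D array.
for row, a in zip(padded, arrays):
    row[:len(a)] = a

# The -inf padding never wins the maximum as long as a real value exists.
dominant = padded.max(axis=0)
print(dominant)  # → [4. 7. 2. 6.]
```

If the padding value is nan instead of -inf, `np.nanmax(padded, axis=0)` would be needed, since a plain max propagates nan.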