2016-07-31 29 views
1

How can I efficiently vstack a large sequence of numpy array chunks?

I generate a sequence of numpy arrays as follows:

def chunker(seq, size): 
    return (seq[pos:pos + size] for pos in range(0, len(seq), size)) 

for i in chunker(X,10000): 
    e = function(i) 
    print('new matrix',e) 

new matrix (10000, 3208) 
new matrix (10000, 3208) 
new matrix (10000, 3208) 
new matrix (10000, 3208) 
new matrix (10000, 3208) 
new matrix (10000, 3208) 
new matrix (10000, 3208) 
new matrix (10000, 3208) 
... 
new matrix (10000, 3208) 

I want to vstack the n matrices above into a single matrix. So I tried the following:

X = np.vstack(e) 

However, when I print X I get the same thing again:

new matrix (10000, 3208) 
new matrix (10000, 3208) 
new matrix (10000, 3208) 
new matrix (10000, 3208) 
new matrix (10000, 3208) 
new matrix (10000, 3208) 
new matrix (10000, 3208) 
new matrix (10000, 3208) 
... 
new matrix (10000, 3208) 

instead of a single new vstacked matrix. Any idea how to vstack this sequence of numpy arrays?

Update

Following jedwards's answer, I edited my code as follows:

import numpy as np

def chunker(seq, size): 
    return (seq[pos:pos + size] for pos in range(0, len(seq), size)) 

for (r,i) in enumerate(chunker(X,10000)): 
    e = function(i) 
    print('new matrix',e) 
    X[r,:] = e 

print(X) 
+1

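The input to 'vstack' should be a list of arrays whose last dimension matches. 'e' doesn't look like that. You need to collect the individual 'e' arrays into a list. – hpaulj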

+1

In your loop, what is the shape of 'e'? Of 'X'? Of 'X[r,:]'? – hpaulj

+0

X.shape = (878049, 3208), e.shape = (10000, 3208), merged[r,:].shape = (3208,). Thanks for the help, @hpaulj! I am also getting: 'The kernel appears to have died. It will restart automatically.' –
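
One possible explanation for the kernel dying, which the thread does not confirm, is memory: a dense float64 array with the shape reported above needs roughly 22.5 GB. A quick estimate, assuming float64 elements (the dtype is not stated in the question):

```python
import numpy as np

rows, cols = 878049, 3208  # X.shape as reported in the comment above

# Assumption: float64 elements; the question does not state the dtype
itemsize = np.dtype(np.float64).itemsize
n_bytes = rows * cols * itemsize

print(f"{n_bytes / 1e9:.1f} GB")
```

If that exceeds the available RAM, holding the full stacked result in memory cannot succeed however the loop is written; a smaller dtype or a disk-backed array such as `np.memmap` may be worth considering.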

Answers

1

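One way, though probably not the most efficient, is to build a list of the chunks you want to stack and then stack them once, outside the loop.

For example: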

import numpy as np 

def chunker(seq, size): 
    return (seq[pos:pos + size] for pos in range(0, len(seq), size)) 

# Some fake function (n.b. this is a silly way to reverse a list) 
def function(arr): 
    arr.reverse() 
    return arr 

# Generate fake X 
X = list(range(100)) 

chunks = [] 
for i in chunker(X,10): 
    e = function(i) 
    print('new matrix',e) 
    chunks.append(e) 

merged = np.vstack(chunks) 
print(merged) 

Output:

 
new matrix [9, 8, 7, 6, 5, 4, 3, 2, 1, 0] 
new matrix [19, 18, 17, 16, 15, 14, 13, 12, 11, 10] 
new matrix [29, 28, 27, 26, 25, 24, 23, 22, 21, 20] 
new matrix [39, 38, 37, 36, 35, 34, 33, 32, 31, 30] 
new matrix [49, 48, 47, 46, 45, 44, 43, 42, 41, 40] 
new matrix [59, 58, 57, 56, 55, 54, 53, 52, 51, 50] 
new matrix [69, 68, 67, 66, 65, 64, 63, 62, 61, 60] 
new matrix [79, 78, 77, 76, 75, 74, 73, 72, 71, 70] 
new matrix [89, 88, 87, 86, 85, 84, 83, 82, 81, 80] 
new matrix [99, 98, 97, 96, 95, 94, 93, 92, 91, 90] 
[[ 9  8  7  6  5  4  3  2  1  0]
 [19 18 17 16 15 14 13 12 11 10]
 [29 28 27 26 25 24 23 22 21 20]
 [39 38 37 36 35 34 33 32 31 30]
 [49 48 47 46 45 44 43 42 41 40]
 [59 58 57 56 55 54 53 52 51 50]
 [69 68 67 66 65 64 63 62 61 60]
 [79 78 77 76 75 74 73 72 71 70]
 [89 88 87 86 85 84 83 82 81 80]
 [99 98 97 96 95 94 93 92 91 90]]
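
The same collect-then-stack pattern carries over to chunks that are 2-D arrays, as in the question. A minimal sketch, assuming a stand-in `function` that simply doubles its input (hypothetical; the real one is not shown in the question) and a small `X` in place of the real data:

```python
import numpy as np

def chunker(seq, size):
    # Yield consecutive slices of at most `size` rows
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))

def function(chunk):
    # Hypothetical stand-in for the real per-chunk computation
    return chunk * 2

# Small stand-in for the real (878049, 3208) array
X = np.arange(25 * 4).reshape(25, 4)

# Collect every processed chunk, then stack once outside the loop
chunks = [function(c) for c in chunker(X, 10)]
merged = np.vstack(chunks)

print([c.shape for c in chunks])
print(merged.shape)
```

The last chunk may have fewer rows than the others; `np.vstack` only requires that the trailing dimensions of the arrays match.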

Or, without creating the intermediate list:

merged = np.zeros([0,10]) 
for i in chunker(X,10): 
    e = function(i) 
    print('new matrix',e) 
    merged = np.vstack([merged, e]) 

print(merged) 

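But the most efficient approach is to initialize the numpy array before the loop and then set its rows inside the loop. You first need to work out the dimensions of the final merged array (here I simply create it as a 10x10 matrix because I know the dimensions).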

merged = np.zeros([10,10]) 
for (r,i) in enumerate(chunker(X,10)): 
    e = function(i) 
    print('new matrix',e) 
    merged[r,:] = e 

print(merged) 
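
In the example above each chunk is a single row, so `merged[r,:] = e` works. When each chunk holds several rows, as with the (10000, 3208) blocks in the question, assigning it to one row raises a broadcasting ValueError; the chunk has to be written into a slice of rows instead. A minimal sketch, assuming a stand-in `function` (hypothetical) that returns one output row per input row, and an output width that is known in advance:

```python
import numpy as np

def chunker(seq, size):
    # Yield consecutive slices of at most `size` rows
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))

def function(chunk):
    # Hypothetical stand-in; assumed to keep the number of rows unchanged
    return chunk + 1.0

# Small stand-in for the real data
X = np.arange(25 * 4, dtype=float).reshape(25, 4)
size = 10

# Allocate the result once, before the loop
merged = np.empty((X.shape[0], X.shape[1]))

start = 0
for chunk in chunker(X, size):
    e = function(chunk)
    stop = start + e.shape[0]
    merged[start:stop, :] = e  # write a block of rows, not a single row
    start = stop

print(merged.shape)
```

Tracking `start` and `stop` from `e.shape[0]` rather than from the chunk size keeps the final, shorter chunk from overrunning the array.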
+0

These are very large arrays; is there a more efficient way to do this? –

+1

I added two additional options. The bottom one is by far the most efficient. – jedwards

+0

I got this exception: 'ValueError: could not broadcast input array from shape (100) into shape (3208)'. Any idea how to proceed?... Thanks for the help! –
