2012-02-21 76 views
2

Numpy/Scipy: How to reconstruct an ndarray?

I am working on a classification problem.
I have an ndarray of shape (604329, 33), with 32 features and one label column:

>>> n_data.shape 
(604329, 33) 

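The third column of this ndarray is the label, with values 0 and 1.
I need to make the third column the last column, so that it is easier to handle when slicing.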

Question:
Is there a way to rebuild the ndarray so that this third column becomes the last column?
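
To make the goal concrete, here is a minimal sketch of the slicing that becomes possible once the label sits in the last column. The small random array `data` is a hypothetical stand-in for `n_data`, and the names `X` and `y` are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical stand-in for n_data: 5 rows, 32 feature columns, label column last.
rng = np.random.default_rng(0)
features = rng.random((5, 32))
labels = rng.integers(0, 2, size=(5, 1))  # 0/1 labels
data = np.hstack((features, labels))

X = data[:, :-1]  # every column except the last: the 32 features
y = data[:, -1]   # the last column: the labels

print(X.shape, y.shape)
```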

Answers

2

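If I understand correctly, you want the following, which cyclically shifts all columns three places to the left, so that the third column (index 2) ends up last, with the first two columns just before it: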

my_array = numpy.roll(my_array,-3,axis=1) 
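
As a quick illustration of what roll does to the column order, here is a small sketch on a one-row array whose values equal their original column indices. The array `a` and the name `rolled` are made up for this example.

```python
import numpy as np

a = np.arange(6).reshape(1, 6)   # one row: [[0 1 2 3 4 5]]
rolled = np.roll(a, -3, axis=1)  # shift columns three places to the left, wrapping around

print(rolled)  # [[3 4 5 0 1 2]]
```

The column that started at index 2 is now last, but the columns that started at indices 0 and 1 have also wrapped around to the end.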
2

The following will do this:

x = np.hstack((x[:,:3],x[:,4:],x[:,3:4])) 

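where x is your ndarray. Note that this moves the column at index 3 (counting from zero); if the label is the third column counting from one, the slices would be x[:,:2], x[:,3:] and x[:,2:3] instead.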
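
The same hstack idea can be wrapped in a small helper that takes the column index as a parameter. This is only a sketch; the function name `move_column_to_end` and the parameter `k` are hypothetical, not part of NumPy.

```python
import numpy as np

def move_column_to_end(x, k):
    """Return a copy of x with column k (zero-based) moved to the last position."""
    # Columns before k, columns after k, then column k kept 2-D via the k:k+1 slice.
    return np.hstack((x[:, :k], x[:, k + 1:], x[:, k:k + 1]))

x = np.arange(10).reshape(2, 5)
moved = move_column_to_end(x, 2)
print(moved)
# [[0 1 3 4 2]
#  [5 6 8 9 7]]
```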

2

As an alternative to aix's solution, you can index the array directly with a reordered list of column indices, without needing hstack:

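>>> import numpy
>>> a = numpy.array([range(33) for _ in range(4)])
>>> indices = list(range(33))  # a real list, so that pop/append also work on Python 3
>>> indices.append(indices.pop(3))  # move index 3 to the end of the list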
>>> a[:,indices] 
array([[ 0, 1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 
     18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 3], 
     [ 0, 1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 
     18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 3], 
     [ 0, 1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 
     18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 3], 
     [ 0, 1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 
     18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 3]]) 
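
The index list can also be built without mutating it in place. The sketch below assumes a hypothetical helper `column_order`; it is one way to write the same reordering, not the answer author's code.

```python
import numpy as np

def column_order(n_cols, k):
    """Return column indices 0..n_cols-1 with index k moved to the end."""
    return [i for i in range(n_cols) if i != k] + [k]

a = np.arange(12).reshape(2, 6)
order = column_order(a.shape[1], 3)
reordered = a[:, order]  # indexing with a list of column indices returns a reordered copy

print(order)      # [0, 1, 2, 4, 5, 3]
print(reordered)
# [[ 0  1  2  4  5  3]
#  [ 6  7  8 10 11  9]]
```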

The index-list approach is a little faster for small arrays:

>>> %timeit numpy.hstack((a[:,:3], a[:,4:], a[:, 3:4])) 
100000 loops, best of 3: 19.1 us per loop 
>>> %timeit indices = list(range(33)); indices.append(indices.pop(3)); a[:,indices]
100000 loops, best of 3: 14 us per loop 

But for larger arrays it is actually slower:

>>> a = numpy.array([range(33) for _ in range(600000)]) 
>>> %timeit numpy.hstack((a[:,:3], a[:,4:], a[:, 3:4])) 
1 loops, best of 3: 385 ms per loop 
>>> %timeit indices = list(range(33)); indices.append(indices.pop(3)); a[:,indices]
1 loops, best of 3: 670 ms per loop 

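If you do not need to preserve the order of the columns (i.e., if you can use roll), then Mr. E's solution is the fastest for a large a. Note that roll with a shift of -3 wraps the first three columns around to the end rather than moving a single column: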

>>> %timeit numpy.roll(a, -3, axis=1) 
10 loops, best of 3: 120 ms per loop
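
The timings above come from IPython's %timeit. To repeat the comparison in a plain Python script, something like the following sketch could be used. The array size, the dictionary `candidates`, and the repeat counts are assumptions chosen for illustration, and the numbers will depend on the machine and the NumPy version.

```python
import timeit
import numpy as np

a = np.arange(33 * 50_000).reshape(50_000, 33)  # size chosen only for illustration
order = list(range(33))
order.append(order.pop(3))  # column index 3 moved to the end

candidates = {
    "hstack": lambda: np.hstack((a[:, :3], a[:, 4:], a[:, 3:4])),
    "index list": lambda: a[:, order],
    "roll": lambda: np.roll(a, -3, axis=1),
}

# hstack and the index list give the same array; roll gives a different column order.
assert np.array_equal(candidates["hstack"](), candidates["index list"]())

timings = {}
for name, fn in candidates.items():
    # Best of 3 repeats, 3 calls each, reported as seconds per call.
    timings[name] = min(timeit.repeat(fn, number=3, repeat=3)) / 3
    print(f"{name:>10}: {timings[name] * 1e3:.2f} ms per call")
```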