假設我有以下numpy的結構數組:轉換numpy的結構數組子集numpy的陣列,而不復制
In [250]: x
Out[250]:
array([(22, 2, -1000000000, 2000), (22, 2, 400, 2000),
(22, 2, 804846, 2000), (44, 2, 800, 4000), (55, 5, 900, 5000),
(55, 5, 1000, 5000), (55, 5, 8900, 5000), (55, 5, 11400, 5000),
(33, 3, 14500, 3000), (33, 3, 40550, 3000), (33, 3, 40990, 3000),
(33, 3, 44400, 3000)],
dtype=[('f1', '<i4'), ('f2', '<f4'), ('f3', '<f4'), ('f4', '<i4')])
我想修改上述陣列的一個子集到正規numpy的陣列。 對於我的應用程序來說,不需要創建副本(僅適用於視圖)。
字段從上述結構的陣列通過使用下面的函數檢索到:
def fields_view(array, fields):
return array.getfield(numpy.dtype(
{name: array.dtype.fields[name] for name in fields}
))
如果我對字段「F2」和「F3」,我會執行以下操作:
In [251]: y=fields_view(x,['f2','f3'])
In [252]: y
Out [252]:
array([(2.0, -1000000000.0), (2.0, 400.0), (2.0, 804846.0), (2.0, 800.0),
(5.0, 900.0), (5.0, 1000.0), (5.0, 8900.0), (5.0, 11400.0),
(3.0, 14500.0), (3.0, 40550.0), (3.0, 40990.0), (3.0, 44400.0)],
dtype={'names':['f2','f3'], 'formats':['<f4','<f4'], 'offsets':[4,8], 'itemsize':12})
有一種方法可以直接從原始結構化數組的'f2'和'f3'字段獲得一個ndarray。但是,對於我的應用程序,有必要構建這個中間結構化數組,因爲此數據子集是類的一個屬性。
我無法將中間結構化數組轉換爲常規numpy數組而不做副本。
In [253]: y.view(('<f4', len(y.dtype.names)))
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-54-f8fc3a40fd1b> in <module>()
----> 1 y.view(('<f4', len(y.dtype.names)))
ValueError: new type not compatible with array.
此功能還可以用來記錄數組轉換爲ndarray:
def recarr_to_ndarr(x,typ):
fields = x.dtype.names
shape = x.shape + (len(fields),)
offsets = [x.dtype.fields[name][1] for name in fields]
assert not any(np.diff(offsets, n=2))
strides = x.strides + (offsets[1] - offsets[0],)
y = np.ndarray(shape=shape, dtype=typ, buffer=x,
offset=offsets[0], strides=strides)
return y
不過,我得到以下錯誤:
In [254]: recarr_to_ndarr(y,'<f4')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-65-2ebda2a39e9f> in <module>()
----> 1 recarr_to_ndarr(y,'<f4')
<ipython-input-62-8a9eea8e7512> in recarr_to_ndarr(x, typ)
8 strides = x.strides + (offsets[1] - offsets[0],)
9 y = np.ndarray(shape=shape, dtype=typ, buffer=x,
---> 10 offset=offsets[0], strides=strides)
11 return y
12
TypeError: expected a single-segment buffer object
功能工作正常,如果我創建副本:
In [255]: recarr_to_ndarr(np.array(y),'<f4')
Out[255]:
array([[ 2.00000000e+00, -1.00000000e+09],
[ 2.00000000e+00, 4.00000000e+02],
[ 2.00000000e+00, 8.04846000e+05],
[ 2.00000000e+00, 8.00000000e+02],
[ 5.00000000e+00, 9.00000000e+02],
[ 5.00000000e+00, 1.00000000e+03],
[ 5.00000000e+00, 8.90000000e+03],
[ 5.00000000e+00, 1.14000000e+04],
[ 3.00000000e+00, 1.45000000e+04],
[ 3.00000000e+00, 4.05500000e+04],
[ 3.00000000e+00, 4.09900000e+04],
[ 3.00000000e+00, 4.44000000e+04]], dtype=float32)
兩個陣列似乎沒有區別:
In [66]: y
Out[66]:
array([(2.0, -1000000000.0), (2.0, 400.0), (2.0, 804846.0), (2.0, 800.0),
(5.0, 900.0), (5.0, 1000.0), (5.0, 8900.0), (5.0, 11400.0),
(3.0, 14500.0), (3.0, 40550.0), (3.0, 40990.0), (3.0, 44400.0)],
dtype={'names':['f2','f3'], 'formats':['<f4','<f4'], 'offsets':[4,8], 'itemsize':12})
In [67]: np.array(y)
Out[67]:
array([(2.0, -1000000000.0), (2.0, 400.0), (2.0, 804846.0), (2.0, 800.0),
(5.0, 900.0), (5.0, 1000.0), (5.0, 8900.0), (5.0, 11400.0),
(3.0, 14500.0), (3.0, 40550.0), (3.0, 40990.0), (3.0, 44400.0)],
dtype={'names':['f2','f3'], 'formats':['<f4','<f4'], 'offsets':[4,8], 'itemsize':12})
非常感謝你爲這個詳細的解答!它幫助我瞭解如何解決我的問題,查看更新後的帖子。 – snowleopard