NumPy's genfromtxt returns differently structured data depending on the dtype parameter. I have the following:
from numpy import genfromtxt
seg_data1 = genfromtxt('./datasets/segmentation.all', delimiter=',', dtype="|S5")
seg_data2 = genfromtxt('./datasets/segmentation.all', delimiter=',', dtype=["|S5"] + ["float" for n in range(19)])
print seg_data1
print seg_data2
print seg_data1[:,0:1]
print seg_data2[:,0:1]
It turns out that seg_data1 and seg_data2 are not the same kind of structure. Here is what gets printed:
[['BRICK' '140.0' '125.0' ..., '7.777' '0.545' '-1.12']
['BRICK' '188.0' '133.0' ..., '8.444' '0.538' '-0.92']
['BRICK' '105.0' '139.0' ..., '7.555' '0.532' '-0.96']
...,
['CEMEN' '128.0' '161.0' ..., '10.88' '0.540' '-1.99']
['CEMEN' '150.0' '158.0' ..., '12.22' '0.503' '-1.94']
['CEMEN' '124.0' '162.0' ..., '14.55' '0.479' '-2.02']]
[ ('BRICK', 140.0, 125.0, 9.0, 0.0, 0.0, 0.2777779, 0.06296301, 0.66666675, 0.31111118, 6.185185, 7.3333335, 7.6666665, 3.5555556, 3.4444444, 4.4444447, -7.888889, 7.7777777, 0.5456349, -1.1218182)
('BRICK', 188.0, 133.0, 9.0, 0.0, 0.0, 0.33333334, 0.26666674, 0.5, 0.077777736, 6.6666665, 8.333334, 7.7777777, 3.8888888, 5.0, 3.3333333, -8.333333, 8.444445, 0.53858024, -0.92481726)
('BRICK', 105.0, 139.0, 9.0, 0.0, 0.0, 0.27777782, 0.107407436, 0.83333325, 0.52222216, 6.111111, 7.5555553, 7.2222223, 3.5555556, 4.3333335, 3.3333333, -7.6666665, 7.5555553, 0.5326279, -0.96594584)
...,
('CEMEN', 128.0, 161.0, 9.0, 0.0, 0.0, 0.55555534, 0.25185192, 0.77777785, 0.16296278, 7.148148, 5.5555553, 10.888889, 5.0, -4.7777777, 11.222222, -6.4444447, 10.888889, 0.5409177, -1.9963073)
('CEMEN', 150.0, 158.0, 9.0, 0.0, 0.0, 2.166667, 1.6333338, 1.388889, 0.41851807, 8.444445, 7.0, 12.222222, 6.111111, -4.3333335, 11.333333, -7.0, 12.222222, 0.50308645, -1.9434487)
('CEMEN', 124.0, 162.0, 9.0, 0.11111111, 0.0, 1.3888888, 1.1296295, 2.0, 0.8888891, 10.037037, 8.0, 14.555555, 7.5555553, -6.111111, 13.555555, -7.4444447, 14.555555, 0.4799313, -2.0293121)]
[['BRICK']
['BRICK']
['BRICK']
...,
['CEMEN']
['CEMEN']
['CEMEN']]
Traceback (most recent call last):
File "segmentationdata.py", line 14, in <module>
print seg_data2[:,0:1]
IndexError: too many indices for array
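The IndexError above comes from the shape of the second result: with one dtype per column, genfromtxt returns a 1-D structured array, so there is no second axis to slice. The sketch below reproduces this on a small in-memory sample. The sample data is made up for illustration, the code is written for Python 3 (hence "U5" rather than "|S5"), and the field names f0, f1, ... are the defaults genfromtxt generates when none are given.

```python
import io
import numpy as np

# Hypothetical three-column sample standing in for ./datasets/segmentation.all
sample = "BRICK,140.0,125.0\nGRASS,128.0,161.0\n"

# One dtype for everything: an ordinary 2-D array of strings
plain = np.genfromtxt(io.StringIO(sample), delimiter=",", dtype="U5")

# One dtype per column: a 1-D structured array whose elements are records
structured = np.genfromtxt(io.StringIO(sample), delimiter=",",
                           dtype=["U5", "float", "float"])

print(plain.shape)             # two axes: rows and columns
print(structured.shape)        # a single axis: one record per row
print(structured.dtype.names)  # auto-generated field names

# Fields are selected by name, not by a second positional index
labels = structured["f0"]

try:
    structured[:, 0:1]
except IndexError as exc:
    print("IndexError:", exc)
```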
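I would rather have genfromtxt return the data in the form of seg_data1, but I don't know of any built-in way to force seg_data2 into that form. As far as I can tell, there is no simple way to do the following with seg_data2:
seg_target1 = seg_data1[:,0:1]
seg_data1 = seg_data1[:,1:]
For now I can do data.astype(float), but the point is: isn't that what genfromtxt should do in the first place when I give it a list of dtypes?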
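For what it's worth, here are two possible ways to end up with a label column plus a 2-D float array, sketched on a made-up in-memory sample under Python 3. The first converts the numeric fields of the structured array with numpy.lib.recfunctions.structured_to_unstructured (available in NumPy 1.16 and later); the second skips the structured array entirely and reads the source twice with usecols. The variable names are only illustrative.

```python
import io
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical three-column sample standing in for ./datasets/segmentation.all
sample = "BRICK,140.0,125.0\nGRASS,128.0,161.0\n"

# Option 1: read a structured array, then split it by field name
structured = np.genfromtxt(io.StringIO(sample), delimiter=",",
                           dtype=["U5", "float", "float"])
names = structured.dtype.names
seg_target = structured[names[0]]                 # 1-D array of labels
seg_data = rfn.structured_to_unstructured(
    structured[list(names[1:])])                  # 2-D float array

# Option 2: read the label column and the numeric columns separately
labels = np.genfromtxt(io.StringIO(sample), delimiter=",",
                       usecols=0, dtype="U5")
features = np.genfromtxt(io.StringIO(sample), delimiter=",",
                         usecols=(1, 2))

print(seg_data.shape, seg_data.dtype)
print(features.shape, features.dtype)
```

With the real file, usecols=range(1, 20) would presumably cover the 19 numeric columns mentioned in the question.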
What exactly is '["|S5"] + ["float" for n in range(19)]' supposed to represent as a dtype? –
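I don't quite understand what you're trying to do. You say you would *'rather have 'genfromtxt' return the data in the form of 'seg_data1''*, so what's wrong with 'seg_data1'? It looks like you may be confusing *fields* in a structured array with *columns* in a multidimensional array. Fields can have different dtypes, but columns cannot. If you want a data structure where "columns" can have different dtypes, then you probably want to use ['pandas.DataFrame'](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html). –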
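As a rough illustration of the DataFrame suggestion, the sketch below loads a made-up in-memory sample with pandas.read_csv and then slices by column position, which is the operation that fails on a structured array. It assumes pandas is installed and that the file has no header row; the variable names are only illustrative.

```python
import io
import numpy as np
import pandas as pd

# Hypothetical three-column sample standing in for ./datasets/segmentation.all
sample = "BRICK,140.0,125.0\nGRASS,128.0,161.0\n"

# header=None: the file has no header row, so columns are numbered 0, 1, 2
df = pd.read_csv(io.StringIO(sample), header=None)

# Each column keeps its own dtype
print(df.dtypes)

# Positional 2-D slicing works, unlike on a structured array
seg_target = df.iloc[:, 0:1]           # label column, still 2-D
seg_data = df.iloc[:, 1:].to_numpy()   # plain 2-D float array

print(seg_target.shape)
print(seg_data.shape, seg_data.dtype)
```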
Does 'pandas.DataFrame' use a structured array to store its data? Or a 'dtype=object' array? Or does it depend on what is convenient? – hpaulj