2013-09-25 34 views
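ValueError when trying to convert a string array to an array with mixed given dtypes

I want to convert an array containing strings into an array that holds float values and a string. My code currently looks like this: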

datatype1=np.dtype([ 
('LOCATION_THETA',np.float64), 
('LOCATION_PHI',np.float64), 
('ETHETA_MAGN',np.float64), 
('ETHETA_PHASE',np.float64), 
('EPHI_MAGN',np.float64), 
('EPHI_PHASE',np.float64), 
('DIRECTIVITY_VERT',np.float64), 
('DIRECTIVITY_HORIZ',np.float64), 
('DIRECTIVITY_TOTAL',np.float64), 
('POLARISATION_AXIALR',np.float64), 
('POLARISATION_ANGLE',np.float64), 
('POLARISATION_DIRECTION','|S5')]) 

table2=np.array(table,dtype=datatype1) 

where table (a NumPy array of strings) looks like this:

[['0.00' '0.00' '5.751E-01' '-2.08' '9.532E-05' '-86.19' '1.7442' '-73.8670' '1.7442' '0.0002' '0.00' 'RIGHT'] 
['2.00' '0.00' '5.747E-01' '-2.11' '1.291E-04' '-82.47' '1.7390' '-71.2312' '1.7390' '0.0002' '0.00' 'RIGHT'] 
['4.00' '0.00' '5.738E-01' '-2.21' '1.632E-04' '-80.31' '1.7243' '-69.1973' '1.7243' '0.0003' '0.00' 'RIGHT'] 
['6.00' '0.00' '5.722E-01' '-2.38' '1.973E-04' '-78.94' '1.7001' '-67.5479' '1.7001' '0.0003' '0.00' 'RIGHT'] 
['8.00' '0.00' '5.699E-01' '-2.61' '2.314E-04' '-78.02' '1.6663' '-66.1644' '1.6663' '0.0004' '0.01' 'RIGHT'] 
... 

However, when I run the script, I get the following error:

ValueError: could not convert string to float: RIGHT 

It shouldn't do that, because I want that string to be |S5, not float...

Thanks in advance for your help!

It works for me, but I had to insert commas between the values in 'table'. The error suggests that a comma may be missing between one or more columns of 'table'. – atomh33ls

Answer

What happens here is that when you do:

ts = np.array(t, dtype=dt) 

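the dtype is applied to each individual element of table (t stands for your table and dt for your datatype1). It works fine for the first 11 elements, then it reaches 'RIGHT', which it cannot turn into a float. Here is what it does without the 'RIGHT' column (this will be messy!):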

>>> t[:2,:-1] 
array([['0.00', '0.00', '5.751E-01', '-2.08', '9.532E-05', '-86.19', '1.7442', '-73.8670', '1.7442', '0.0002', '0.00'], 
     ['2.00', '0.00', '5.747E-01', '-2.11', '1.291E-04', '-82.47', '1.7390', '-71.2312', '1.7390', '0.0002', '0.00']], 
     dtype='|S9') 

>>> np.array(t[:2,:-1], dt) 
array([[(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, '0.00'), 
     (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, '0.00'), 
     (0.5751, 0.5751, 0.5751, 0.5751, 0.5751, 0.5751, 0.5751, 0.5751, 0.5751, 0.5751, 0.5751, '5.751'), 
     (-2.08, -2.08, -2.08, -2.08, -2.08, -2.08, -2.08, -2.08, -2.08, -2.08, -2.08, '-2.08'), 
     (9.532e-05, 9.532e-05, 9.532e-05, 9.532e-05, 9.532e-05, 9.532e-05, 9.532e-05, 9.532e-05, 9.532e-05, 9.532e-05, 9.532e-05, '9.532'), 
     (-86.19, -86.19, -86.19, -86.19, -86.19, -86.19, -86.19, -86.19, -86.19, -86.19, -86.19, '-86.1'), 
     (1.7442, 1.7442, 1.7442, 1.7442, 1.7442, 1.7442, 1.7442, 1.7442, 1.7442, 1.7442, 1.7442, '1.744'), 
     (-73.867, -73.867, -73.867, -73.867, -73.867, -73.867, -73.867, -73.867, -73.867, -73.867, -73.867, '-73.8'), 
     (1.7442, 1.7442, 1.7442, 1.7442, 1.7442, 1.7442, 1.7442, 1.7442, 1.7442, 1.7442, 1.7442, '1.744'), 
     (0.0002, 0.0002, 0.0002, 0.0002, 0.0002, 0.0002, 0.0002, 0.0002, 0.0002, 0.0002, 0.0002, '0.000'), 
     (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, '0.00')], 
     [(2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, '2.00'), 
     (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, '0.00'), 
     (0.5747, 0.5747, 0.5747, 0.5747, 0.5747, 0.5747, 0.5747, 0.5747, 0.5747, 0.5747, 0.5747, '5.747'), 
     (-2.11, -2.11, -2.11, -2.11, -2.11, -2.11, -2.11, -2.11, -2.11, -2.11, -2.11, '-2.11'), 
     (0.0001291, 0.0001291, 0.0001291, 0.0001291, 0.0001291, 0.0001291, 0.0001291, 0.0001291, 0.0001291, 0.0001291, 0.0001291, '1.291'), 
     (-82.47, -82.47, -82.47, -82.47, -82.47, -82.47, -82.47, -82.47, -82.47, -82.47, -82.47, '-82.4'), 
     (1.739, 1.739, 1.739, 1.739, 1.739, 1.739, 1.739, 1.739, 1.739, 1.739, 1.739, '1.739'), 
     (-71.2312, -71.2312, -71.2312, -71.2312, -71.2312, -71.2312, -71.2312, -71.2312, -71.2312, -71.2312, -71.2312, '-71.2'), 
     (1.739, 1.739, 1.739, 1.739, 1.739, 1.739, 1.739, 1.739, 1.739, 1.739, 1.739, '1.739'), 
     (0.0002, 0.0002, 0.0002, 0.0002, 0.0002, 0.0002, 0.0002, 0.0002, 0.0002, 0.0002, 0.0002, '0.000'), 
     (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, '0.00')]], 
     dtype=[('LOCATION_THETA', '<f8'), ('LOCATION_PHI', '<f8'), ('ETHETA_MAGN', '<f8'), ('ETHETA_PHASE', '<f8'), ('EPHI_MAGN', '<f8'), ('EPHI_PHASE', '<f8'), ('DIRECTIVITY_VERT', '<f8'), ('DIRECTIVITY_HORIZ', '<f8'), ('DIRECTIVITY_TOTAL', '<f8'), ('POLARISATION_AXIALR', '<f8'), ('POLARISATION_ANGLE', '<f8'), ('POLARISATION_DIRECTION', 'S5')]) 

So, as you can see, for every single element you get a cute little tuple (a "record") with dtype datatype1 (it even builds the last string field for you).
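The same fill-every-field behaviour can be seen in isolation by assigning a single scalar to a structured array. This is only a minimal sketch: the two-field dtype and the names value and label are made up for illustration and do not come from the question.

```python
import numpy as np

# Hypothetical two-field dtype, much smaller than datatype1.
small = np.dtype([('value', np.float64), ('label', 'S5')])

rec = np.zeros(2, dtype=small)

# One scalar is converted once per field of every record,
# so it has to be convertible to all of the field types.
rec[:] = 3

print(rec['value'])   # the float field of both records
print(rec['label'])   # the string field of both records
```

A value such as 'RIGHT' cannot pass through the float fields, which is where the ValueError in the question comes from.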

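There are a few ways to fix this. The best is to create or import the array with the right dtype from the start, so that you never have to copy it. For some conversions you can make a view, which interprets the existing data as if it had the new compound dtype. That will not convert strings to numbers, though, because parsing text is more involved than simply pretending the bytes are a number.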
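To illustrate the point about views, here is a small sketch with made-up data (the array plain and the field names x, y, z are assumptions for this example). A view only relabels bytes that are already there, so it suits values that are already floats but cannot parse text.

```python
import numpy as np

# Hypothetical 2x3 float array, one row per record.
plain = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

# Three float64 fields: exactly the same 24 bytes per row as in plain.
xyz = np.dtype([('x', np.float64), ('y', np.float64), ('z', np.float64)])

# Reinterpret the existing memory; nothing is copied or converted.
records = plain.view(xyz).reshape(-1)

print(records['y'])                      # second column of plain
print(np.shares_memory(plain, records))  # both names refer to the same buffer
```

Because the buffer is shared, writing to records also changes plain. A string array viewed this way would still hold the raw characters, not numbers.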

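For your case, you should use a recarray, which is slightly more complex than a regular structured array, because then you can use the fromarrays function. It takes a list of columns, each with a uniform type, rather than a list of rows, hence the transpose: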

>>> np.rec.fromarrays(t.T, dt) 
rec.array([ (0.0, 0.0, 0.5751, -2.08, 9.532e-05, -86.19, 1.7442, -73.867, 1.7442, 0.0002, 0.0, 'RIGHT'), 
     (2.0, 0.0, 0.5747, -2.11, 0.0001291, -82.47, 1.739, -71.2312, 1.739, 0.0002, 0.0, 'RIGHT'), 
     (4.0, 0.0, 0.5738, -2.21, 0.0001632, -80.31, 1.7243, -69.1973, 1.7243, 0.0003, 0.0, 'RIGHT'), 
     (6.0, 0.0, 0.5722, -2.38, 0.0001973, -78.94, 1.7001, -67.5479, 1.7001, 0.0003, 0.0, 'RIGHT'), 
     (8.0, 0.0, 0.5699, -2.61, 0.0002314, -78.02, 1.6663, -66.1644, 1.6663, 0.0004, 0.01, 'RIGHT')], 
     dtype=[('LOCATION_THETA', '<f8'), ('LOCATION_PHI', '<f8'), ('ETHETA_MAGN', '<f8'), ('ETHETA_PHASE', '<f8'), ('EPHI_MAGN', '<f8'), ('EPHI_PHASE', '<f8'), ('DIRECTIVITY_VERT', '<f8'), ('DIRECTIVITY_HORIZ', '<f8'), ('DIRECTIVITY_TOTAL', '<f8'), ('POLARISATION_AXIALR', '<f8'), ('POLARISATION_ANGLE', '<f8'), ('POLARISATION_DIRECTION', 'S5')]) 

Lovely! But wait, now it is a rec.array... If you want to keep it that way, that's fine. If you want it to be a regular structured array, do:

>>> np.asarray(np.rec.fromarrays(t.T, dt))
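
For reference, the whole conversion can be sketched end to end on a cut-down table. The three-column data and the field names theta, magn and direction are assumptions made for this example; the real table has twelve columns and uses datatype1.

```python
import numpy as np

# Hypothetical reduced table: two numeric columns and one text column.
t = np.array([['0.00', '5.751E-01', 'RIGHT'],
              ['2.00', '5.747E-01', 'RIGHT']])

dt = np.dtype([('theta', np.float64),
               ('magn', np.float64),
               ('direction', 'S5')])

# fromarrays wants one array per field, so pass the columns (t.T), not the rows.
rec = np.rec.fromarrays(t.T, dtype=dt)

# Drop the recarray subclass and keep a plain ndarray with named fields.
plain = np.asarray(rec)

print(type(rec).__name__, type(plain).__name__)
print(plain['theta'], plain['magn'], plain['direction'])
```

Under Python 3 the 'S5' field holds bytes, so the last column comes back as b'RIGHT' rather than 'RIGHT'.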