For a 2D array, this drops every row that consists entirely of nan:

    a[~np.all(np.isnan(a), axis=1)]
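
As a quick illustration of what that one-liner keeps, here is a minimal sketch on made-up data (the array values are an assumption, chosen only for the demo). It also shows the `np.any` variant, which is stricter and drops a row as soon as it contains a single nan:

```python
import numpy as np

# Illustrative data: row 1 is partly nan, row 2 is entirely nan.
a = np.array([[1.0, 2.0],
              [np.nan, 5.0],
              [np.nan, np.nan],
              [4.0, 3.0]])

# Drop only the rows where every value is nan (the expression from the answer).
all_nan_rows_removed = a[~np.all(np.isnan(a), axis=1)]

# Stricter variant: drop rows that contain at least one nan.
any_nan_rows_removed = a[~np.any(np.isnan(a), axis=1)]

print(all_nan_rows_removed.shape)  # 3 rows remain
print(any_nan_rows_removed.shape)  # 2 rows remain
```

Which one you want depends on whether a partly filled row still counts as data.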
For a structured array (recarray), you can do this:

    import numpy as np

    def remove_nan(a, split=True):
        # Field names of the structured array; only the first field is tested for nan.
        cols = a.dtype.names
        col = cols[0]
        test = ~np.isnan(a[col])
        if not split:
            # Keep only the records whose first field is not nan.
            return a[test]
        else:
            # Positions where a run of nan starts or ends; np.split cuts before each index.
            indices = [i + 1 for i in range(len(a) - 1) if test[i + 1] != test[i]]
            # Keep only the chunks that hold valid records.
            return [i for i in np.split(a, indices) if not np.isnan(i[col][0])]

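To keep only the rows without nan, as a single array, use split=False. With the default split=True, the array is split at each run of nan rows, for example: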

    a = np.array([(1,2),(2,2),(np.nan,np.nan),(np.nan,np.nan),(4,4),(4,3)], dtype=[('test',float),('col2',float)])
    remove_nan(a)
    #[array([(1.0, 2.0), (2.0, 2.0)],
    #       dtype=[('test', '<f8'), ('col2', '<f8')]),
    # array([(4.0, 4.0), (4.0, 3.0)],
    #       dtype=[('test', '<f8'), ('col2', '<f8')])]

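
Note that remove_nan only looks at the first field when deciding which records are nan. If the nan values can sit in different fields on different rows, a mask built from every field may be safer. A minimal sketch, assuming all fields are floating point; `nan_row_mask` is a hypothetical helper name, not part of NumPy:

```python
import numpy as np

def nan_row_mask(a):
    """Return a boolean mask that is True where every field of the record is nan.

    Assumes every field of the structured array `a` has a float dtype.
    """
    # One boolean array per field, stacked and reduced across the fields.
    return np.all([np.isnan(a[name]) for name in a.dtype.names], axis=0)

a = np.array([(1, 2), (2, 2), (np.nan, np.nan), (np.nan, np.nan), (4, 4), (4, 3)],
             dtype=[('test', float), ('col2', float)])

mask = nan_row_mask(a)
kept = a[~mask]  # records that are not entirely nan
print(len(kept))
```

Swapping `np.all` for `np.any` would instead flag records with at least one nan field.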
Check out pandas, it is very well suited to this kind of thing: http://pandas.pydata.org/pandas-docs/dev/missing_data.html – YXD
Thanks, I'll take a look at this. – ala