Pandas: ignore non-numeric values

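I have a DataFrame with a column X, and I am trying to add a new column Y. If X > 15000 the value should be A, otherwise B. If X is non-numeric (BBOX-001, Mobi-1), Y should show the value from column X: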

          X         Y
0     13500         B
1     13600         B
2  BBOX-001  BBOX-001
3    Mobi-1    Mobi-1
4     15003         A
5     15004         A

I have the code below, but how do I ignore the non-numeric values in column X?

df['Y'] = np.where(df['X'] > 15000, 'A', 'B')

Source

2017-04-23 wazzahenry

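When df['X'] contains a mix of numbers and strings, the column's dtype will be object instead of a numeric dtype. The number-like items in df['X'] may be ints or floats, or possibly even strings (it is unclear from your question). In that situation, many numeric operations (such as df['X'] > 15000) may raise errors.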
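A minimal sketch of the failure, in case it helps to see it. The data is made up to match the question, and it assumes the number-like entries are stored as ints next to the strings (the question does not say which):

```python
import pandas as pd

# Hypothetical data modelled on the question: ints mixed with strings.
df = pd.DataFrame({'X': [13500, 13600, 'BBOX-001', 'Mobi-1', 15003, 15004]})

# The mixed column is not treated as numeric.
is_numeric = pd.api.types.is_numeric_dtype(df['X'])
print(is_numeric)

try:
    df['X'] > 15000
    comparison_failed = False
except TypeError as exc:
    # Python cannot order a str against an int.
    comparison_failed = True
    print(f"TypeError: {exc}")
```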

To treat the number-like values as numbers, use pd.to_numeric to convert the column to a numeric dtype:

In [41]: numeric_X = pd.to_numeric(df['X'], errors='coerce')

In [43]: numeric_X
Out[43]:
0    13500.0
1    13600.0
2        NaN
3        NaN
4    15003.0
5    15004.0
Name: X, dtype: float64
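The errors='coerce' argument is what makes this work. As a small sketch on a shortened, made-up series: with the default errors='raise', the first unparseable entry aborts the conversion, whereas 'coerce' turns it into NaN.

```python
import pandas as pd

# Shortened, hypothetical version of column X.
X = pd.Series(['13500', 'BBOX-001', '15003'])

# Default behaviour (errors='raise'): 'BBOX-001' cannot be parsed.
try:
    pd.to_numeric(X)
    strict_failed = False
except ValueError as exc:
    strict_failed = True
    print(f"ValueError: {exc}")

# errors='coerce': unparseable entries become NaN instead.
coerced = pd.to_numeric(X, errors='coerce')
print(coerced.tolist())  # [13500.0, nan, 15003.0]
```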

You can also identify the string-like values by testing for NaNs:

is_stringlike = np.isnan(numeric_X)
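One caveat worth a sketch: a NaN test alone also flags entries that were already missing before the conversion. Assuming that distinction matters for your data (the question does not mention missing values), you could combine the test with a check on the original column. The series below is hypothetical:

```python
import pandas as pd

# Hypothetical column with one genuinely missing entry (None).
X = pd.Series(['13500', 'BBOX-001', None, '15004'])
numeric_X = pd.to_numeric(X, errors='coerce')

# NaN after coercion AND present in the original: a real non-numeric string.
is_stringlike = numeric_X.isna() & X.notna()
print(is_stringlike.tolist())  # [False, True, False, False]
```

Series.isna() is used here in place of np.isnan; on a float64 series the two give the same answer.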

Putting the two together with np.select:

import numpy as np
import pandas as pd

df = pd.DataFrame({'X': ['13500', '13600', 'BBOX-001', 'Mobi-1', '15003', '15004']})

# Non-numeric entries become NaN
numeric_X = pd.to_numeric(df['X'], errors='coerce')
is_stringlike = np.isnan(numeric_X)
# The first matching condition wins; rows matching neither get the default 'B'
conditions = [numeric_X > 15000, is_stringlike]
choices = ['A', df['X']]
df['Y'] = np.select(conditions, choices, default='B')
print(df)

which yields

          X         Y
0     13500         B
1     13600         B
2  BBOX-001  BBOX-001
3    Mobi-1    Mobi-1
4     15003         A
5     15004         A
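If you need this in more than one place, the same steps could be wrapped in a small function. This is only a sketch: label_by_threshold and its parameters are names invented for illustration, and the sample column mixes ints and numeric strings to show that both are handled.

```python
import numpy as np
import pandas as pd

def label_by_threshold(x, threshold=15000, above='A', below='B'):
    """Hypothetical helper: label numeric entries, pass non-numeric ones through."""
    numeric = pd.to_numeric(x, errors='coerce')
    # First matching condition wins; everything else gets `below`.
    conditions = [numeric > threshold, numeric.isna()]
    choices = [above, x]
    return pd.Series(np.select(conditions, choices, default=below), index=x.index)

# Made-up data: ints and numeric strings side by side.
df = pd.DataFrame({'X': [13500, '13600', 'BBOX-001', 'Mobi-1', 15003, '15004']})
df['Y'] = label_by_threshold(df['X'])
print(df)
```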

Source

2017-04-23 10:49:43 unutbu

You can achieve this with convert_objects:

import pandas as pd 
import numpy as np 

df = pd.DataFrame({'X': ['13500', '13600', 'BBOX-001', 'Mobi-1', '15003', '15004']}) 
# Convert only numeric value to put it in comparison 
df['Y'] = np.where(df.X.convert_objects(convert_numeric=True) > 15000, 'A', 'B') 

print (df)

Output:

          X  Y
0     13500  B
1     13600  B
2  BBOX-001  B
3    Mobi-1  B
4     15003  A
5     15004  A
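Two things to keep in mind with this approach. As far as I know, convert_objects is no longer available from pandas 1.0 onward, so on a current install something like the sketch below, which swaps in pd.to_numeric, would be needed. Also, as the output shows, the non-numeric rows end up as B rather than keeping their X value, because NaN > 15000 evaluates to False.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'X': ['13500', '13600', 'BBOX-001', 'Mobi-1', '15003', '15004']})

# pd.to_numeric stands in for convert_objects(convert_numeric=True).
numeric_X = pd.to_numeric(df['X'], errors='coerce')
# NaN > 15000 is False, so non-numeric rows fall through to 'B'.
df['Y'] = np.where(numeric_X > 15000, 'A', 'B')
print(df['Y'].tolist())  # ['B', 'B', 'B', 'B', 'A', 'A']
```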

Source

2017-04-23 10:49:11 Serenity
