2017-03-18 14 views
5

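Pandas fillna for multiple columns with the mode of each column

Using census data, I want to replace the NaN values in two columns ("workclass" and "native-country") with the respective modes of those two columns. I can get the modes easily: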

mode = df.filter(["workclass", "native-country"]).mode() 

It returns a DataFrame:

workclass native-country 
0 Private United-States 

However,

df.filter(["workclass", "native-country"]).fillna(mode) 

does not replace the NaN values in each column with anything, let alone with the mode of that column. Is there a smooth way to do this?

Answers

5

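If you want to impute missing values with the mode in some columns of a DataFrame df, you can pass fillna a Series created by selecting the first row of the mode result by position with iloc: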

cols = ["workclass", "native-country"] 
df[cols]=df[cols].fillna(df.mode().iloc[0]) 

Or:

df[cols]=df[cols].fillna(mode.iloc[0]) 

Or, adapted to your solution:

df[cols]=df.filter(cols).fillna(mode.iloc[0]) 
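The difference between the two calls comes down to alignment. The following is a minimal sketch on a small made-up frame (the data and the names `filled_by_frame` and `filled_by_series` are illustrative assumptions, not part of the question). When `fillna` receives a DataFrame, the fill values are aligned on both row labels and column names, so a one-row mode frame can only reach row 0. When it receives a Series, the Series index is matched against the column names and the value is used for every row of that column.

```python
import numpy as np
import pandas as pd

# Hypothetical data, chosen so that row 0 and row 3 both contain NaN.
df = pd.DataFrame({
    "workclass": [np.nan, "Private", "Private", np.nan],
    "native-country": ["United-States", np.nan, "United-States", np.nan],
})
cols = ["workclass", "native-country"]

# One-row DataFrame whose only row label is 0.
mode = df[cols].mode()

# DataFrame as fill value: aligned on row labels AND column names,
# so only NaN cells in row 0 can be filled.
filled_by_frame = df[cols].fillna(mode)

# Series as fill value: its index is matched to the column names,
# so every NaN in a column receives that column's mode.
filled_by_series = df[cols].fillna(mode.iloc[0])

print(filled_by_frame)
print(filled_by_series)
```

In this sketch the first result still has NaN in rows 1 and 3, while the second has none.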

Sample:

import numpy as np
import pandas as pd

df = pd.DataFrame({'workclass':['Private','Private',np.nan, 'another', np.nan], 
                   'native-country':['United-States',np.nan,'Canada',np.nan,'United-States'], 
                   'col':[2,3,7,8,9]}) 

print (df) 
   col native-country workclass
0    2  United-States   Private
1    3            NaN   Private
2    7         Canada       NaN
3    8            NaN   another
4    9  United-States       NaN

mode = df.filter(["workclass", "native-country"]).mode() 
print (mode) 
  workclass native-country
0   Private  United-States

cols = ["workclass", "native-country"] 
df[cols]=df[cols].fillna(df.mode().iloc[0]) 
print (df) 
   col native-country workclass
0    2  United-States   Private
1    3  United-States   Private
2    7         Canada   Private
3    8  United-States   another
4    9  United-States   Private
2

You can do it like this:

df[["workclass", "native-country"]]=df[["workclass", "native-country"]].fillna(value=mode.iloc[0]) 

For example,

import pandas as pd 
d={ 
    'key3': [1,4,4,4,5], 
    'key2': [6,6,4], 
    'key1': [6,4,4], 
} 

df=pd.DataFrame.from_dict(d,orient='index').transpose() 

Then df is:

   key3  key2  key1
0     1     6     6
1     4     6     4
2     4     4     4
3     4   NaN   NaN
4     5   NaN   NaN

Then, by running:

l=df.filter(["key1", "key2"]).mode() 
df[["key1", "key2"]]=df[["key1", "key2"]].fillna(value=l.iloc[0]) 

we get df:

   key3  key2  key1
0     1     6     6
1     4     6     4
2     4     4     4
3     4     6     4
4     5     6     4
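The same pattern can be wrapped in a small helper. This is only a sketch: `fill_with_mode` is a hypothetical name, not a pandas function, and the tie-breaking rule is a choice rather than the only reasonable one. It relies on `mode()` returning its values sorted, so when several values tie, `iloc[0]` picks the smallest. It also guards against the case where every selected column is entirely NaN, because `mode()` then returns an empty frame and `iloc[0]` would raise.

```python
import numpy as np
import pandas as pd


def fill_with_mode(frame, columns):
    """Return a copy of `frame` with NaN in `columns` replaced by each column's mode.

    Hypothetical helper for illustration; not part of pandas.
    """
    result = frame.copy()
    modes = result[columns].mode()
    if modes.empty:
        # Every selected column is entirely NaN, so there is nothing to fill with.
        return result
    # On ties, mode() yields several sorted rows; iloc[0] takes the first one.
    result[columns] = result[columns].fillna(modes.iloc[0])
    return result


df = pd.DataFrame({
    "key1": [6, 4, 4, np.nan, np.nan],
    "key2": [6, 6, 4, np.nan, np.nan],
    "key3": [1, 4, 4, 4, 5],
})
filled = fill_with_mode(df, ["key1", "key2"])
print(filled)

# Tie example: 1 and 2 both occur twice, so the smaller value is used.
tied = fill_with_mode(pd.DataFrame({"a": [2, 2, 1, 1, np.nan]}), ["a"])
print(tied)
```

Because the helper works on a copy, the original df keeps its NaN values; assign the result back if you want the change to persist.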