2016-01-24 304 views
datetime   col_A    col_B
1/1/2012   125.501  A
1/2/2012   NaN      A
1/3/2012   125.501  A
1/4/2013   NaN      A
1/5/2013   125.501  B
2/28/2013  125.501  B
2/28/2014  125.501  B
1/2/2016   125.501  B
1/4/2016   125.501  B
2/28/2016  NaN      B

Fill in missing values in pandas dataframe using mean

For the pandas dataframe above, where col_B contains only strings, I fill in the missing values in col_A with a groupby like this:

df = df.groupby([df.index.month, df.index.day]).transform(lambda x: x.fillna(x.mean())) 

However, when I do this, col_B disappears. How can I keep col_B with all of its strings?

+1

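On the left-hand side you need df['col_A'] = rather than just df =. You are replacing the whole dataframe with a single column; replace only that column instead. It doesn't matter here, but I would also specify 'col_A' on the right-hand side rather than relying on 'mean' to ignore 'col_B'. – JohnE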

+0

If I have multiple columns like 'col_A', would this also work: df[['col_A', 'col_C']] = ...? – user308827

Answers


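I think you can select ['col_A'] after the groupby and assign the result back to that column only, so col_B is left untouched: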

df['col_A'] = df.groupby([df.index.month, df.index.day])['col_A'].transform(
    lambda x: x.fillna(x.mean()))
print(df)
       col_A col_B 
datetime     
2012-01-01 125.501  A 
2012-01-02 125.501  A 
2012-01-03 125.501  A 
2013-01-04 125.501  A 
2013-01-05 125.501  B 
2013-02-28 125.501  B 
2014-02-28 125.501  B 
2016-01-02 125.501  B 
2016-01-04 125.501  B 
2016-02-28 125.501  B
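
Regarding the follow-up question about several columns: the same pattern appears to extend to a list of columns, as long as the same list is used on both sides of the assignment. The sketch below rebuilds the sample frame from the question so it can be run on its own. Note that col_C and its values are made up for illustration (they are not in the original data), and numeric_cols and keys are just local helper names.

```python
import numpy as np
import pandas as pd

# Sample data from the question, plus a hypothetical numeric column col_C.
df = pd.DataFrame(
    {
        "col_A": [125.501, np.nan, 125.501, np.nan, 125.501,
                  125.501, 125.501, 125.501, 125.501, np.nan],
        "col_C": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0, np.nan, 9.0, np.nan],
        "col_B": list("AAAABBBBBB"),
    },
    index=pd.to_datetime(
        ["1/1/2012", "1/2/2012", "1/3/2012", "1/4/2013", "1/5/2013",
         "2/28/2013", "2/28/2014", "1/2/2016", "1/4/2016", "2/28/2016"],
        format="%m/%d/%Y",
    ),
)
df.index.name = "datetime"

# Columns to fill; the string column col_B is deliberately left out.
numeric_cols = ["col_A", "col_C"]

# Group rows that share the same calendar month and day across years.
keys = [df.index.month, df.index.day]

# Fill each NaN with the mean of its (month, day) group, column by column.
# Only the selected columns are overwritten, so col_B is preserved.
df[numeric_cols] = df.groupby(keys)[numeric_cols].transform(
    lambda s: s.fillna(s.mean())
)

print(df)
```

A value stays NaN if every entry in its (month, day) group is missing, since the group mean is then NaN as well.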