2015-11-30 156 views

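Python pandas: filling missing values

I have a DataFrame with some missing rows. How can I fill in the missing values relative to a date_range? The DataFrame contains the columns District, tech, odate, and perc, and some dates are missing from it. I want to compare the dates in the date range against each District and tech combination and fill perc with 100 wherever a date is missing.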

df: 
    District tech odate perc 
0 Bangalore 2G 2015-04-04 98.45 
1 Bangalore 3G 2015-04-03 96.54 
2 Bangalore 3G 2015-04-01 98.23 
3 Bangalore 2G 2015-04-01 31.25 
4 Bangalore 2G 2015-04-02 92.35 

The result should be:

df: 
    District tech odate perc 
0 Bangalore 2G 2015-04-04 98.45 
1 Bangalore 3G 2015-04-03 96.54 
2 Bangalore 3G 2015-04-01 98.23 
3 Bangalore 2G 2015-04-01 31.25 
4 Bangalore 2G 2015-04-02 92.35 
5 Bangalore 3G 2015-04-02 100 
6 Bangalore 2G 2015-04-03 100 
7 Bangalore 3G 2015-04-04 100 

The missing values are filled with 100.

Answer

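First, I add a row to each group (grouped by column tech) whose odate is the maximum value of column odate, so that every group extends to the same last date.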

Then you can use resample with a custom aggregation function.
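To illustrate why the extra max-date row matters, here is a minimal sketch using only the 3G rows from the question. The Series name `s` is just for illustration. Daily resampling covers a group's own first to last date, so without the added row the 3G group would stop at 2015-04-03.

```python
import pandas as pd

# perc values for the 3G group only (dates taken from the question's sample)
s = pd.Series([98.23, 96.54],
              index=pd.to_datetime(["2015-04-01", "2015-04-03"]))

# Daily resampling spans the group's own first to last date;
# 2015-04-02 appears as NaN, but 2015-04-04 is not created at all.
daily = s.resample("D").first()
print(daily)
```

The gap inside the range (2015-04-02) shows up as a missing value that can be filled, while the date after the group's last observation never appears.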

 District tech  odate perc 
0 Bangalore 2G 2015-04-04 98.45 
1 Bangalore 3G 2015-04-03 96.54 
2 Bangalore 3G 2015-04-01 98.23 
3 Bangalore 2G 2015-04-01 31.25 
4 Bangalore 2G 2015-04-02 92.35 
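# add a row at the overall max date to each group
import pandas as pd

# df is the frame shown above; odate must be datetime for idxmax and resample
df['odate'] = pd.to_datetime(df['odate'])

last_row_index = df.groupby('tech')['odate'].idxmax()
rows = df.loc[last_row_index].copy()
rows['perc'] = 100
# set odate to the max value of column odate
rows['odate'] = df['odate'].max()
df = pd.concat([df, rows], ignore_index=True)
# a group that already has the max date keeps its original row
df = df.drop_duplicates(subset=['tech', 'odate'])
df = df.sort_values(['tech', 'odate'], ascending=True).reset_index(drop=True)
print(df)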

    District tech  odate perc 
0 Bangalore 2G 2015-04-01 31.25 
1 Bangalore 2G 2015-04-02 92.35 
2 Bangalore 2G 2015-04-04 98.45 
3 Bangalore 3G 2015-04-01 98.23 
4 Bangalore 3G 2015-04-03 96.54 
5 Bangalore 3G 2015-04-04 100.00 
# custom aggregation: keep the day's value if there is one, else 100
def custom(x):
    if len(x) > 0:
        return x.iloc[0]
    else:
        return 100

conversion = {'District': 'first', 'perc': custom}
# resample each tech group to daily frequency; ffill fills District on the new days
df = (df.groupby('tech')
        .apply(lambda g: g.set_index('odate').resample('D').agg(conversion).ffill())
        .reset_index())
# order columns
df = df[['District', 'tech', 'odate', 'perc']]
print(df)
    District tech  odate perc 
0 Bangalore 2G 2015-04-01 31.25 
1 Bangalore 2G 2015-04-02 92.35 
2 Bangalore 2G 2015-04-03 100.00 
3 Bangalore 2G 2015-04-04 98.45 
4 Bangalore 3G 2015-04-01 98.23 
5 Bangalore 3G 2015-04-02 100.00 
6 Bangalore 3G 2015-04-03 96.54 
7 Bangalore 3G 2015-04-04 100.00 
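
For comparison, here is a sketch of a different route to the same result, closer to the question's mention of date_range. It is not the method used in this answer. It assumes that every District should be combined with every tech over the full date range (true for the sample, where there is a single District), and the names `days`, `idx`, and `filled` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "District": ["Bangalore"] * 5,
    "tech": ["2G", "3G", "3G", "2G", "2G"],
    "odate": pd.to_datetime(["2015-04-04", "2015-04-03", "2015-04-01",
                             "2015-04-01", "2015-04-02"]),
    "perc": [98.45, 96.54, 98.23, 31.25, 92.35],
})

# Every day between the first and last observed date.
days = pd.date_range(df["odate"].min(), df["odate"].max(), freq="D")

# All District x tech x day combinations (assumption: each pairing is wanted).
idx = pd.MultiIndex.from_product(
    [df["District"].unique(), df["tech"].unique(), days],
    names=["District", "tech", "odate"],
)

# Reindex onto the full grid; combinations with no row get perc = 100.
filled = (df.set_index(["District", "tech", "odate"])
            .reindex(idx, fill_value=100)
            .reset_index())
print(filled)
```

Because the grid is built from the overall minimum and maximum dates, this variant needs no extra max-date row, but it does require each (District, tech, odate) combination to be unique in the input.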

Please check the edited answer, and then [accept](http://stackoverflow.com/tour) it. Thanks. – jezrael