First, I add a row holding the maximum value of column odate to each group (grouped by column tech).
Then you can use resample with a custom how method.
    District  tech       odate   perc
0  Bangalore    2G  2015-04-04  98.45
1  Bangalore    3G  2015-04-03  96.54
2  Bangalore    3G  2015-04-01  98.23
3  Bangalore    2G  2015-04-01  31.25
4  Bangalore    2G  2015-04-02  92.35
import pandas as pd

# add a max-date row to each group
lastRowIndex = df.groupby('tech').odate.idxmax()
# df.ix was removed in pandas 1.0; .loc performs the same label-based lookup here
rows = df.loc[lastRowIndex].copy()
rows['perc'] = 100
# fill with the max value of column odate
rows['odate'] = df['odate'].max()
df = pd.concat([df, rows], ignore_index=True)
df = df.drop_duplicates(subset=['tech', 'odate'])
df = df.sort_values(['tech', 'odate'], ascending=True).reset_index(drop=True)
print(df)
    District  tech       odate    perc
0  Bangalore    2G  2015-04-01   31.25
1  Bangalore    2G  2015-04-02   92.35
2  Bangalore    2G  2015-04-04   98.45
3  Bangalore    3G  2015-04-01   98.23
4  Bangalore    3G  2015-04-03   96.54
5  Bangalore    3G  2015-04-04  100.00
# custom resample
# note: the how= and fill_method= arguments of resample were deprecated in
# pandas 0.18 and removed in later releases, so this call needs an older pandas
def custom(x):
    if x.any():
        return x
    else:
        return 100

conversion = {'District': 'first', 'perc': custom}
df = df.groupby('tech').apply(lambda x: x.set_index('odate').resample('D', how=conversion, fill_method='ffill')).reset_index()
# order columns
df = df[['District', 'tech', 'odate', 'perc']]
print(df)
    District  tech       odate    perc
0  Bangalore    2G  2015-04-01   31.25
1  Bangalore    2G  2015-04-02   92.35
2  Bangalore    2G  2015-04-03  100.00
3  Bangalore    2G  2015-04-04   98.45
4  Bangalore    3G  2015-04-01   98.23
5  Bangalore    3G  2015-04-02  100.00
6  Bangalore    3G  2015-04-03   96.54
7  Bangalore    3G  2015-04-04  100.00
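
On recent pandas releases, where resample no longer accepts how= and fill_method=, the same table could probably be produced with reindex instead. The following is only a sketch of that substitute technique, not the code from this answer: the helper name fill_missing_days is made up for illustration, and it assumes odate is already a datetime column and that each (tech, odate) pair occurs at most once.

```python
import pandas as pd

# Sample data from the answer; odate is parsed to datetime (an assumption).
df = pd.DataFrame({
    "District": ["Bangalore"] * 5,
    "tech": ["2G", "3G", "3G", "2G", "2G"],
    "odate": pd.to_datetime(
        ["2015-04-04", "2015-04-03", "2015-04-01", "2015-04-01", "2015-04-02"]
    ),
    "perc": [98.45, 96.54, 98.23, 31.25, 92.35],
})


def fill_missing_days(frame):
    """Hypothetical helper: one row per day and tech, perc = 100 for missing days."""
    last_day = frame["odate"].max()  # every group is extended to the overall max date
    pieces = []
    for tech, group in frame.groupby("tech"):
        # Daily range from the group's first date to the overall last date.
        days = pd.date_range(group["odate"].min(), last_day, freq="D")
        # reindex inserts NaN rows for the missing days (needs unique dates per group).
        group = group.set_index("odate").reindex(days).rename_axis("odate")
        group["perc"] = group["perc"].fillna(100)    # missing days count as 100
        group["District"] = group["District"].ffill()  # carry the district forward
        group["tech"] = tech
        pieces.append(group.reset_index())
    out = pd.concat(pieces, ignore_index=True)
    return out[["District", "tech", "odate", "perc"]]


result = fill_missing_days(df)
print(result)
```

Unlike the two-step approach above, this sketch does not need the extra max-date row, because each group's date range is built explicitly up to the overall maximum of odate.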
Please check the edited answer, and then [accept](http://stackoverflow.com/tour) it. Thanks. – jezrael