2017-12-27 285 views

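Pandas: subtract the minimum date from the maximum date for each group

I want to add a column to this table containing the difference between the maximum and minimum action_date for each customer_id.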

Input:

action_date customer_id 
2017-08-15  1 
2017-08-21  1 
2017-08-21  1 
2017-09-02  1 
2017-08-28  2 
2017-09-29  2 
2017-10-15  3 
2017-10-30  3 
2017-12-05  3 

and get this table.

Output:

action_date customer_id diff 
2017-08-15  1   18 
2017-08-21  1   18 
2017-08-21  1   18 
2017-09-02  1   18 
2017-08-28  2   32 
2017-09-29  2   32 
2017-10-15  3   51 
2017-10-30  3   51 
2017-12-05  3   51 

I tried this code, but it produces a lot of NaN values:

group = df.groupby(by='customer_id') 
df['diff'] = (group['action_date'].max() - group['action_date'].min()).dt.days 

Answer

You can use the transform method:

In [23]: df['diff'] = (df.groupby('customer_id')['action_date']
    ...:                 .transform(lambda x: (x.max() - x.min()).days))

In [24]: df
Out[24]:
  action_date  customer_id  diff
0  2017-08-15            1    18
1  2017-08-21            1    18
2  2017-08-21            1    18
3  2017-09-02            1    18
4  2017-08-28            2    32
5  2017-09-29            2    32
6  2017-10-15            3    51
7  2017-10-30            3    51
8  2017-12-05            3    51
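
For anyone who wants to reproduce this, here is a self-contained sketch built from the sample data in the question. It also tries to show where the NaN values in the original attempt come from: max() - min() on the grouped column returns one row per customer_id, indexed by customer_id, and assigning that to a column aligns on index labels rather than on row position. The names span, per_row, misaligned and diff_via_map are only illustrative and are not part of the question or the answer above. The string form transform('max') / transform('min') is an alternative to the lambda, not a correction of it.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "action_date": pd.to_datetime([
        "2017-08-15", "2017-08-21", "2017-08-21", "2017-09-02",
        "2017-08-28", "2017-09-29",
        "2017-10-15", "2017-10-30", "2017-12-05",
    ]),
    "customer_id": [1, 1, 1, 1, 2, 2, 3, 3, 3],
})

dates = df.groupby("customer_id")["action_date"]

# Aggregation: one value per group, indexed by customer_id (1, 2, 3).
span = dates.max() - dates.min()

# Assigning span.dt.days straight to df['diff'] reindexes it to df.index
# (0..8), so only labels 1, 2 and 3 match and the other rows become NaN.
misaligned = span.dt.days.reindex(df.index)

# transform returns a result with the same index as df, so it can be
# assigned directly.
per_row = dates.transform("max") - dates.transform("min")
df["diff"] = per_row.dt.days

# Alternative: keep the aggregation and look up each row's customer_id.
df["diff_via_map"] = df["customer_id"].map(span.dt.days)

print(df)
```

Both new columns should match the expected output in the question. The map variant may be handy if the per-customer span is also needed on its own.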

Thanks! You're awesome :) – Superbman

@Superbman, glad I could help :) – MaxU