Groupby max value and return the corresponding row in a pandas DataFrame

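My DataFrame consists of students, dates, and test scores. I want to find the max date for each student and return the corresponding row (ultimately, I am most interested in each student's most recent score). How can I do this in pandas?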

Let's say my DataFrame looks like this (simplified version):

Student_id  Date      Score
Tina1       1/17/17   .95
John2       1/18/17   .8
Lia1        12/13/16  .845
John2       1/25/17   .975
Tina1       1/1/17    .78
Lia1        6/12/16   .89

This is what I want:

Student_id  Date      Score
Tina1       1/17/17   .95
Lia1        12/13/16  .845
John2       1/25/17   .975

I found this on SO, but it gives me a positional index out-of-bounds error.

df.iloc[df.groupby('student_id').apply(lambda x: x['date'].idxmax())]

What are other ways to achieve the same thing?

Source

2017-07-07 Jane Sully

You can sort the DataFrame by date and then use groupby.tail to get the most recent record for each student:

df.iloc[pd.to_datetime(df.Date, format='%m/%d/%y').argsort()].groupby('Student_id').tail(1) 

#Student_id  Date Score 
#2  Lia1 12/13/16 0.845 
#0 Tina1 1/17/17 0.950 
#3 John2 1/25/17 0.975
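
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same sort-then-tail idea. The frame is rebuilt from the sample in the question; the variable names `parsed` and `latest` are only illustrative, and the `.to_numpy()` call is my addition so that `iloc` receives plain integer positions.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Student_id": ["Tina1", "John2", "Lia1", "John2", "Tina1", "Lia1"],
    "Date": ["1/17/17", "1/18/17", "12/13/16", "1/25/17", "1/1/17", "6/12/16"],
    "Score": [0.95, 0.8, 0.845, 0.975, 0.78, 0.89],
})

# Parse the strings so rows are ordered chronologically, not alphabetically.
parsed = pd.to_datetime(df["Date"], format="%m/%d/%y")

# Reorder the rows by parsed date, then keep the last row of each group.
# argsort() yields integer positions, which is what iloc expects.
latest = df.iloc[parsed.argsort().to_numpy()].groupby("Student_id").tail(1)
print(latest)
```

Note that `Date` stays a string column here; only the ordering uses the parsed values.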

Or, to avoid sorting, use idxmax (this works if you don't have duplicated index labels):

df.loc[pd.to_datetime(df.Date, format='%m/%d/%y').groupby(df.Student_id).idxmax()] 

# Student_id  Date Score 
#3  John2 1/25/17 0.975 
#2  Lia1 12/13/16 0.845 
#0  Tina1 1/17/17 0.950
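
The caveat about duplicated index labels matters because idxmax returns index labels, not positions, which is also why the result is passed to loc rather than iloc. Below is a rough sketch of one possible workaround, assuming the original index carries no information you need to keep: reset it to a unique range first. The duplicated index in the example is hypothetical and only there to trigger the situation.

```python
import pandas as pd

# Sample data from the question, with a deliberately duplicated index
# (hypothetical, to illustrate the caveat).
df = pd.DataFrame(
    {
        "Student_id": ["Tina1", "John2", "Lia1", "John2", "Tina1", "Lia1"],
        "Date": ["1/17/17", "1/18/17", "12/13/16", "1/25/17", "1/1/17", "6/12/16"],
        "Score": [0.95, 0.8, 0.845, 0.975, 0.78, 0.89],
    },
    index=[0, 0, 1, 1, 2, 2],
)

# Assumption: the old index can be discarded. A fresh RangeIndex makes
# every label unique, so each label from idxmax selects exactly one row.
unique = df.reset_index(drop=True)

parsed = pd.to_datetime(unique["Date"], format="%m/%d/%y")

# One label per student, pointing at that student's latest date.
latest_labels = parsed.groupby(unique["Student_id"]).idxmax()

# Labels, not positions, so select with loc.
latest = unique.loc[latest_labels]
print(latest)
```

Without the reset, `df.loc` on a duplicated label would return every row sharing that label, not just the latest one.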

Source

2017-07-07 17:17:28 Psidom
