Sum the last entry of each group from a pandas groupby

I have a CSV like the following.

a,b,c,d 
A,A1,10,B1 
A,A1,20,B1 
A,A1,30,B1 
A,A1,10,B4 
A,A1,20,B4 
A,A1,10,B5 
A,A1,10,B6 
B,A2,10,B7 
B,A2,20,B1 
B,A2,100,B1

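I want to take the last row of each group and then sum column c for each value of column 'a'.

I am able to get the last row of each group using .last(), but I am stuck on computing the sum per 'a', where 'a' is the first groupby criterion.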

>>> tmp.groupby(['a','b','d']).nth(-1)
           c
a b  d
A A1 B1   30
     B4   20
     B5   10
     B6   10
B A2 B1  100
     B7   10
>>> tmp.groupby(['a','b','d']).nth(-1)['c'].sum()
180

Instead of 180, I need 70 (the sum for group A) and 110 (the sum for group B).

I think the grouping is lost when using last() or nth(-1).

Source

2017-07-14 pythonRcpp

Edited the question. I think I made a mistake. – pythonRcpp

You can add sum with level=0 to aggregate over the first index level, or chain another groupby:

df = tmp.groupby(['a','b','d'])['c'].nth(-1).sum(level=0) 
print (df) 
a 
A  70 
B 110 
Name: c, dtype: int64

df = tmp.groupby(['a','b','d'])['c'].nth(-1).groupby(level=0).sum() 
print (df) 
a 
A  70 
B 110 
Name: c, dtype: int64

The same works with last:

df = tmp.groupby(['a','b','d'])['c'].last().sum(level=0) 
print (df) 
a 
A  70 
B 110 
Name: c, dtype: int64

df = tmp.groupby(['a','b','d'])['c'].last().groupby(level=0).sum() 
print (df) 
a 
A  70 
B 110 
Name: c, dtype: int64
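A self-contained sketch of the second-groupby variant, assuming the CSV from the question is loaded into a DataFrame named tmp (the csv_text, last_c and result names are only for illustration). It uses last() followed by groupby(level=0).sum() because, as far as I know, sum(level=0) was removed in pandas 2.0 and nth(-1) no longer puts the group keys in the index there, so this form should behave the same on old and new versions:

```python
import io
import pandas as pd

# The sample data from the question, inlined so the example runs on its own.
csv_text = """a,b,c,d
A,A1,10,B1
A,A1,20,B1
A,A1,30,B1
A,A1,10,B4
A,A1,20,B4
A,A1,10,B5
A,A1,10,B6
B,A2,10,B7
B,A2,20,B1
B,A2,100,B1
"""
tmp = pd.read_csv(io.StringIO(csv_text))

# Last value of c in each (a, b, d) group.
# The result is a Series indexed by a MultiIndex with levels (a, b, d).
last_c = tmp.groupby(['a', 'b', 'd'])['c'].last()

# Regroup on the first index level ('a') and sum the last values.
result = last_c.groupby(level=0).sum()
print(result)  # A -> 70, B -> 110
```

The grouping is not lost after last(); it lives in the index of the intermediate Series, which is why a second groupby on level 0 can recover the per-'a' totals.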

Source

2017-07-14 17:08:22 jezrael

Wow @jezrael, you are a wish come true, my friend. – pythonRcpp

@pythonRcpp - thank you. ;) – jezrael

tmp.groupby(['a','b'])['c'].last()

a b 
A A1  20 
    A2 100 
Name: c, dtype: int64

Source

2017-07-14 16:14:04 thorbjorn444

Sorry, I made an error when posting; I have corrected the question. – pythonRcpp

You can try drop_duplicates followed by groupby:

df.drop_duplicates(subset=['a', 'b', 'd'], keep='last').groupby('a')['c'].sum()

Out[104]: 
a 
A  70 
B 110
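A self-contained sketch of the drop_duplicates approach, assuming the question's CSV is loaded into a DataFrame named tmp (csv_text, deduped and result are illustrative names). It passes keep='last', which is the argument current pandas accepts for keeping the final row of each duplicate set:

```python
import io
import pandas as pd

# The sample data from the question, inlined so the example runs on its own.
csv_text = """a,b,c,d
A,A1,10,B1
A,A1,20,B1
A,A1,30,B1
A,A1,10,B4
A,A1,20,B4
A,A1,10,B5
A,A1,10,B6
B,A2,10,B7
B,A2,20,B1
B,A2,100,B1
"""
tmp = pd.read_csv(io.StringIO(csv_text))

# Keep only the last row for every (a, b, d) combination.
deduped = tmp.drop_duplicates(subset=['a', 'b', 'd'], keep='last')

# Sum c per value of 'a' over the surviving rows.
result = deduped.groupby('a')['c'].sum()
print(result)  # A -> 70, B -> 110
```

This relies on row order in the file to define "last", the same as last() does within each group.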

Source

2017-07-14 17:16:11 Wen
