2017-06-19 85 views

How to roll up events into metadata from a raw DataFrame

I have a DataFrame that looks like this:

Name,Report_ID,Amount,Flag,Actions 
Fizz,123,5,,A 
Fizz,123,10,Y,A 
Buzz,456,10,,B 
Buzz,456,40,,C 
Buzz,456,70,,D 
Bazz,678,100,Y,F 

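From this row-level data, I want to create a new DataFrame that captures various summary statistics (metadata) for each report: mainly sums of amounts, counts of entries, and counts of unique entries. I want the output DataFrame to look like this: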

Report_ID,Number of Flags,Number of Entries,Total,Unique Actions
123,1,2,15,1 
456,0,3,120,3 
678,1,1,100,1 

I tried using groupby, but I can't combine the separately grouped objects back together correctly. So far I have tried:

totals = raw_data.groupby('Report_ID')['Amount'].sum() 
event_count = raw_data.groupby('Report_ID').size() 
num_actions = raw_data.groupby('Report_ID').Actions.nunique() 

output = pd.concat([totals,event_count,num_actions]) 

When I try this I get TypeError: cannot concatenate a non-NDFrame object. Any help would be appreciated!

Answers

1

You can use agg on the groupby, passing a dict that maps each column to the aggregation(s) to apply:

f = dict(Flag=['count', 'size'], Amount='sum', Actions='nunique') 
df.groupby('Report_ID').agg(f) 

           Flag      Amount Actions
          count size    sum nunique
Report_ID
123           1    2     15       1
456           0    3    120       3
678           1    1    100       1
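To get the exact column names asked for in the question, the same idea can be written with named aggregation. This is only a sketch: it assumes pandas 0.25 or later (where named aggregation is available), and it rebuilds the sample data from an in-memory CSV string so that the empty Flag cells are read as NaN. The variable names csv_text and summary are illustrative, not from the original post.

```python
import io

import pandas as pd

# Sample data from the question, loaded from an in-memory CSV string.
csv_text = """Name,Report_ID,Amount,Flag,Actions
Fizz,123,5,,A
Fizz,123,10,Y,A
Buzz,456,10,,B
Buzz,456,40,,C
Buzz,456,70,,D
Bazz,678,100,Y,F
"""
raw_data = pd.read_csv(io.StringIO(csv_text))  # empty Flag cells become NaN

# Named aggregation: output_column=(input_column, function).
# Dict unpacking is used because the target column names contain spaces.
summary = raw_data.groupby("Report_ID").agg(**{
    "Number of Flags": ("Flag", "count"),     # count skips NaN values
    "Number of Entries": ("Flag", "size"),    # size counts every row
    "Total": ("Amount", "sum"),
    "Unique Actions": ("Actions", "nunique"),
})
print(summary)
```

Unlike the dict form above, this produces flat column labels rather than a two-level column index, which is usually easier to export or merge later.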
0

When concatenating, you just need to specify axis=1 so that the Series are placed side by side as columns:

event_count.name = 'Event Count'  # size() returns an unnamed Series, so name it to get a column label
pd.concat([totals, event_count, num_actions], axis=1)

           Amount  Event Count  Actions
Report_ID
123            15            2        1
456           120            3        3
678           100            1        1
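The concat approach can also cover the flag count and label every column in one step by passing keys alongside axis=1. The following is a minimal sketch under a few assumptions: the sample data is rebuilt by hand with missing flags as None, and the names grouped, pieces, and output are illustrative.

```python
import pandas as pd

# Sample data from the question; missing flags are represented as None.
raw_data = pd.DataFrame({
    "Report_ID": [123, 123, 456, 456, 456, 678],
    "Amount": [5, 10, 10, 40, 70, 100],
    "Flag": [None, "Y", None, None, None, "Y"],
    "Actions": ["A", "A", "B", "C", "D", "F"],
})

grouped = raw_data.groupby("Report_ID")

# One Series per statistic, all indexed by Report_ID.
pieces = {
    "Number of Flags": grouped["Flag"].count(),       # non-missing flags only
    "Number of Entries": grouped.size(),               # rows per report
    "Total": grouped["Amount"].sum(),
    "Unique Actions": grouped["Actions"].nunique(),
}

# axis=1 aligns the Series on their shared index and makes each one a column;
# keys supplies the column labels, so no Series needs to be renamed first.
output = pd.concat(list(pieces.values()), axis=1, keys=list(pieces.keys()))
print(output)
```

Because every Series shares the Report_ID index, the columns line up row by row without an explicit merge.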