2017-10-09 77 views

Summing rows in pandas without changing the rest of the dataset

I have a dataset as follows:

Time Sent  Contract  B/S  Price Qty 
9 10:05:46 815 A    BUY  0.55 600 
10 10:05:46 815 A    BUY  0.55 153600 
11 10:08:47 988 A    SELL 0.56 154200 
113 10:20:52 823 B    BUY  0.39 505000 
114 14:33:59 424 B    SELL 0.39 505000 
31 11:31:44 657 C    BUY  0.92 201000 
32 11:36:54 947 C    SELL 0.92 201000 
33 11:42:52 228 C    BUY  0.92 166400 
34 11:42:52 228 C    BUY  0.92 12900 

What I want to achieve here is to sum Qty if and only if all the other columns match. In this case, the desired output is:

Time Sent  Contract  B/S  Price Qty 
9 10:05:46 815 A    BUY  0.55 154200 
11 10:08:47 988 A    SELL 0.56 154200 
113 10:20:52 823 B    BUY  0.39 505000 
114 14:33:59 424 B    SELL 0.39 505000 
31 11:31:44 657 C    BUY  0.92 201000 
32 11:36:54 947 C    SELL 0.92 201000 
33 11:42:52 228 C    BUY  0.92 179300 

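I am very happy with the layout of the DataFrame and would rather not use df.groupby(), which would disturb the current row order. Also note that the first column is the original index position, which I have not reset.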

Any help would be much appreciated. Thanks!

Use `df.groupby(['Time Sent', 'Contract', 'B/S', 'Price'], as_index=False)['Qty'].sum()`? – Zero
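
A note on row order: by default `groupby` sorts the result by the group keys, which is the reordering the question is worried about. Passing `sort=False` keeps the groups in order of first appearance. A minimal sketch on a tiny made-up frame (the data here is hypothetical, not the question's):

```python
import pandas as pd

# Hypothetical three-row frame; "B" appears before "A" on purpose.
df = pd.DataFrame({"Contract": ["B", "A", "B"], "Qty": [1, 2, 3]})

# Default sort=True: groups come back ordered by key (A, then B).
sorted_out = df.groupby("Contract", as_index=False)["Qty"].sum()

# sort=False: groups come back in order of first appearance (B, then A).
unsorted_out = df.groupby("Contract", as_index=False, sort=False)["Qty"].sum()

print(sorted_out)
print(unsorted_out)
```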

Answer

You need to create a column from the index first, then aggregate with agg, using first for the index column and sum for the Qty column:

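# Move the original index into a column named 'index' so it survives the groupby.
df = (df.reset_index()
        # sort=False keeps groups in order of first appearance.
        .groupby(['Time Sent', 'Contract', 'B/S', 'Price'], as_index=False, sort=False)
        .agg({'index': 'first', 'Qty': 'sum'})
        # Restore the first original index value of each group as the index.
        .set_index('index')
        .rename_axis(None))
print(df)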
     Time Sent Contract B/S Price  Qty 
9 10:05:46 815  A BUY 0.55 154200 
11 10:08:47 988  A SELL 0.56 154200 
113 10:20:52 823  B BUY 0.39 505000 
114 14:33:59 424  B SELL 0.39 505000 
31 11:31:44 657  C BUY 0.92 201000 
32 11:36:54 947  C SELL 0.92 201000 
33 11:42:52 228  C BUY 0.92 179300 
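
For anyone who wants to run this end to end, here is a self-contained sketch. The DataFrame is retyped by hand from the sample in the question, and it assumes `Time Sent` is a single string column (for example `10:05:46 815`); the variable names `keys` and `out` are only for illustration.

```python
import pandas as pd

# Sample data retyped from the question; the index holds the original row positions.
df = pd.DataFrame(
    {
        "Time Sent": ["10:05:46 815", "10:05:46 815", "10:08:47 988",
                      "10:20:52 823", "14:33:59 424", "11:31:44 657",
                      "11:36:54 947", "11:42:52 228", "11:42:52 228"],
        "Contract": ["A", "A", "A", "B", "B", "C", "C", "C", "C"],
        "B/S": ["BUY", "BUY", "SELL", "BUY", "SELL", "BUY", "SELL", "BUY", "BUY"],
        "Price": [0.55, 0.55, 0.56, 0.39, 0.39, 0.92, 0.92, 0.92, 0.92],
        "Qty": [600, 153600, 154200, 505000, 505000, 201000, 201000, 166400, 12900],
    },
    index=[9, 10, 11, 113, 114, 31, 32, 33, 34],
)

# Rows are merged only when all of these columns match.
keys = ["Time Sent", "Contract", "B/S", "Price"]

out = (
    df.reset_index()                            # original index becomes column 'index'
      .groupby(keys, as_index=False, sort=False)
      .agg({"index": "first", "Qty": "sum"})    # keep first index, sum quantities
      .set_index("index")
      .rename_axis(None)                        # drop the leftover index name
)
print(out)
```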

If the original index values are not needed and the index can be reset:

df = df.groupby(['Time Sent', 'Contract', 'B/S', 'Price'], as_index=False, sort=False)['Qty'].sum()
print(df)
     Time Sent Contract B/S Price  Qty 
0 10:05:46 815  A BUY 0.55 154200 
1 10:08:47 988  A SELL 0.56 154200 
2 10:20:52 823  B BUY 0.39 505000 
3 14:33:59 424  B SELL 0.39 505000 
4 11:31:44 657  C BUY 0.92 201000 
5 11:36:54 947  C SELL 0.92 201000 
6 11:42:52 228  C BUY 0.92 179300 
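
As a possible alternative (not part of the answer above), the same result can be sketched with `transform` plus `drop_duplicates`, which never leaves the original frame's index or row order. The sample frame is retyped from the question and assumes `Time Sent` is one string column; `keys`, `summed` and `out` are illustrative names.

```python
import pandas as pd

# Sample data retyped from the question, with the original (non-reset) index.
df = pd.DataFrame(
    {
        "Time Sent": ["10:05:46 815", "10:05:46 815", "10:08:47 988",
                      "10:20:52 823", "14:33:59 424", "11:31:44 657",
                      "11:36:54 947", "11:42:52 228", "11:42:52 228"],
        "Contract": ["A", "A", "A", "B", "B", "C", "C", "C", "C"],
        "B/S": ["BUY", "BUY", "SELL", "BUY", "SELL", "BUY", "SELL", "BUY", "BUY"],
        "Price": [0.55, 0.55, 0.56, 0.39, 0.39, 0.92, 0.92, 0.92, 0.92],
        "Qty": [600, 153600, 154200, 505000, 505000, 201000, 201000, 166400, 12900],
    },
    index=[9, 10, 11, 113, 114, 31, 32, 33, 34],
)

keys = ["Time Sent", "Contract", "B/S", "Price"]

# transform('sum') returns a Series aligned to df's index: every row gets its group total.
summed = df.assign(Qty=df.groupby(keys)["Qty"].transform("sum"))

# Keep only the first row of each group; index labels and row order are untouched.
out = summed.drop_duplicates(subset=keys, keep="first")
print(out)
```

The trade-off is an extra pass over the data in exchange for not having to move the index into a column and back.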

Spot on, as always. Thank you! –