2015-05-21

Using the result of groupby().sum() to manipulate the original DataFrame

I have a DataFrame of transactions like this:

    branch      daqu  from    to       style  color  size  amount
5  huadong  shanghai  C30C  C30F  EEBW52301M     39   165       3
8  huadong  shanghai  C30F  C306  EEBW52301M     51   160       2
2  huadong  shanghai  C30G  C306  EEBW52301M     39   165      10
9  huadong  shanghai  C30G  C30C  EEBW52301M     51   170       1
1  huadong  shanghai  C30G  C30F  EEBW52301M     39   160       7
7  huadong  shanghai  C30J  C30D  EEBW52301M     39   170       2
6  huadong  shanghai  C30J  C30F  EEBW52301M     39   170       4
3  huadong  shanghai  C30K  C306  EEBW52301M     39   165       1
0  huadong  shanghai  C30K  C30F  EEBW52301M     39   160       7
4  huadong  shanghai  C30K  C30F  EEBW52301M     39   165       6

Each row means that we need to send 'amount' units of the product with the given style/color/size from the 'from' store to the 'to' store.
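To experiment with this, the table above can be rebuilt as a DataFrame. This is only a sketch: it assumes that `color`, `size`, and `amount` are integers and that the leftmost column is the index, neither of which the printed output confirms. The name `dh_final` is the one the question uses in its `groupby` call.

```python
import pandas as pd

# Rows copied from the table above; the index reproduces the printed row labels.
# Assumption: color, size, and amount are stored as integers.
dh_final = pd.DataFrame(
    {
        "branch": ["huadong"] * 10,
        "daqu": ["shanghai"] * 10,
        "from": ["C30C", "C30F", "C30G", "C30G", "C30G",
                 "C30J", "C30J", "C30K", "C30K", "C30K"],
        "to": ["C30F", "C306", "C306", "C30C", "C30F",
               "C30D", "C30F", "C306", "C30F", "C30F"],
        "style": ["EEBW52301M"] * 10,
        "color": [39, 51, 39, 51, 39, 39, 39, 39, 39, 39],
        "size": [165, 160, 165, 170, 160, 170, 170, 165, 160, 165],
        "amount": [3, 2, 10, 1, 7, 2, 4, 1, 7, 6],
    },
    index=[5, 8, 2, 9, 1, 7, 6, 3, 0, 4],
)

print(dh_final)
```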

Then I grouped by 'from' and 'to', so that I can see how many products will go into each box.

print(dh_final[['from', 'to', 'amount']].groupby(['from', 'to']).sum())

           amount
from to
C30C C30F       3
C30F C306       2
C30G C306      10
     C30C       1
     C30F       7
C30J C30D       2
     C30F       4
C30K C306       1
     C30F      13

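Finally, if the box going from one store to another holds fewer than 5 products, I want to cancel the transactions related to that box. That is, I have to delete the corresponding rows from the original DataFrame. If I do it manually, the result should look like this: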

    branch      daqu  from    to       style  color  size  amount
2  huadong  shanghai  C30G  C306  EEBW52301M     39   165      10
1  huadong  shanghai  C30G  C30F  EEBW52301M     39   160       7
0  huadong  shanghai  C30K  C30F  EEBW52301M     39   160       7
4  huadong  shanghai  C30K  C30F  EEBW52301M     39   165       6

Is there a simple way to do this? How can I use the result of groupby().sum() to manipulate the original DataFrame?

Answer

If I understand correctly, what you want is:

In [53]: 
df['sum'] = df.groupby(['from', 'to'])['amount'].transform('sum') 
df[df['sum'] >= 5]

Out[53]: 
    branch      daqu  from    to       style  color  size  amount  sum
2  huadong  shanghai  C30G  C306  EEBW52301M     39   165      10   10
1  huadong  shanghai  C30G  C30F  EEBW52301M     39   160       7    7
0  huadong  shanghai  C30K  C30F  EEBW52301M     39   160       7   13
4  huadong  shanghai  C30K  C30F  EEBW52301M     39   165       6   13

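So here I call transform on the groupby object, which returns a Series aligned with the original df. I add it as the 'sum' column, and then I can filter as usual.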
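The difference between `sum()` and `transform('sum')` is easiest to see side by side. The sketch below uses only four rows taken from the question's table, with a default integer index rather than the original row labels, so the numbers are small enough to check by hand.

```python
import pandas as pd

# Four rows from the question's table, reduced to the columns that matter here.
df = pd.DataFrame(
    {
        "from": ["C30G", "C30K", "C30K", "C30J"],
        "to": ["C306", "C30F", "C30F", "C30D"],
        "amount": [10, 7, 6, 2],
    }
)

grouped = df.groupby(["from", "to"])["amount"]

# sum() collapses each group to a single row, indexed by the group keys.
totals = grouped.sum()

# transform("sum") keeps the original index and repeats each group's total
# on every row that belongs to that group.
per_row = grouped.transform("sum")

print(totals)
print(per_row)
```

Because `per_row` shares its index with `df`, a comparison on it can be used directly as a boolean mask on `df`, which `totals` cannot.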

Edit

Actually, I think you can do this as a one-liner:

In [67]: 
df[df.groupby(['from', 'to'])['amount'].transform('sum') >= 5]

Out[67]: 
    branch      daqu  from    to       style  color  size  amount
2  huadong  shanghai  C30G  C306  EEBW52301M     39   165      10
1  huadong  shanghai  C30G  C30F  EEBW52301M     39   160       7
0  huadong  shanghai  C30K  C30F  EEBW52301M     39   160       7
4  huadong  shanghai  C30K  C30F  EEBW52301M     39   165       6
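As an alternative to the boolean mask, pandas also offers `groupby(...).filter(...)`, which keeps or drops whole groups based on a function of each group. The sketch below is not from the answer above. It reads "fewer than 5" in the question literally, so a box with a total of exactly 5 is kept. The name `MIN_PER_BOX` and the reduced four-row sample are only for illustration.

```python
import pandas as pd

# Four rows from the question's table, reduced to the columns that matter here.
df = pd.DataFrame(
    {
        "from": ["C30G", "C30K", "C30K", "C30J"],
        "to": ["C306", "C30F", "C30F", "C30D"],
        "amount": [10, 7, 6, 2],
    }
)

MIN_PER_BOX = 5  # hypothetical name; boxes with a smaller total are cancelled

# filter() calls the function once per (from, to) group and keeps every row
# of the groups for which it returns True.
kept = df.groupby(["from", "to"]).filter(
    lambda box: box["amount"].sum() >= MIN_PER_BOX
)

# The same selection written as a mask built from transform("sum").
mask = df.groupby(["from", "to"])["amount"].transform("sum") >= MIN_PER_BOX

print(kept)
print(kept.equals(df[mask]))
```

The `filter` form states the rule per box, which some find easier to read, while the `transform` mask is usually faster on large frames because it avoids calling a Python function for every group.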