我的表看起來像這樣:我應該如何減去兩個數據框和Pandas並顯示需要的輸出?
In [82]:df.head()
Out[82]:
MatDoc MatYr MvT Material Plnt SLoc Batch Customer AmountLC Amount ... PO MatYr.1 MatDoc.1 Order ProfitCtr SLED/BBD PstngDate EntryDate Time Username
0 4912693062 2015 551 100062 HDC2 0001 5G30MC1A11 NaN 9.03 9.06 ... NaN NaN NaN NaN IN1165B085 26.01.2016 01.08.2015 01.08.2015 01:13:16 O33462
1 4912693063 2015 501 166 HDC2 0004 NaN NaN 0.00 0.00 ... NaN NaN NaN NaN IN1165B085 NaN 01.08.2015 01.08.2015 01:13:17 O33462
2 4912693320 2015 551 101343 HDC2 0001 5G28MC1A11 NaN 53.73 53.72 ... NaN NaN NaN NaN IN1165B085 25.01.2016 01.08.2015 01.08.2015 01:16:30 O33462
在這裏,我需要通過組數據上Order
列,僅總和AmountLC
column.Then我需要檢查的Order
列的值,例如,它應該是存在於兩個MvT101group
和MvT102group
。如果Order
匹配兩組數據,那麼我需要從MvT101group
減去MvT102group
。和顯示
Order|Plnt|Material|Batch|Sum101=SumofMvt101ofAmountLC|Sum102=SumofMvt102ofAmountLC|(Sum101-Sum102)/100
我所做的是我首先提出只含101和102新DF:Mvt101
和MvT102
MvT101 = df.loc[df['MvT'] == 101]
MvT102 = df.loc[df['MvT'] == 102]
然後,我通過Order
分組,並得到列的總和值
MvT101group = MvT101.groupby('Order', sort=True)
In [76]:
MvT101group[['AmountLC']].sum()
Out[76]:
Order AmountLC
1127828 16348566.88
1127829 22237710.38
1127830 29803745.65
1127831 30621381.06
1127832 33926352.51
MvT102group = MvT102.groupby('Order', sort=True)
In [77]:
MvT102group[['AmountLC']].sum()
Out[77]:
Order AmountLC
1127830 53221.70
1127831 651475.13
1127834 67442.16
1127835 2477494.17
1128622 218743.14
在此之後,我無法理解我應該怎麼寫我的查詢。 如果你願意,請問我任何進一步的細節。這裏是我工作的CSV文件Link