pandas - converting cumulative values to actual values
Say my DataFrame looks like this:
date,site,country_code,kind,ID,rank,votes,sessions,avg_score,count
2017-03-20,website1,US,0,84,226,0.0,15.0,3.370812,53.0
2017-03-21,website1,US,0,84,214,0.0,15.0,3.370812,53.0
2017-03-22,website1,US,0,84,226,0.0,16.0,3.370812,53.0
2017-03-23,website1,US,0,84,234,0.0,16.0,3.369048,54.0
2017-03-24,website1,US,0,84,226,0.0,16.0,3.369048,54.0
2017-03-25,website1,US,0,84,212,0.0,16.0,3.369048,54.0
2017-03-26,website1,US,0,84,228,0.0,16.0,3.369048,54.0
2017-02-15,website2,AU,1,91,144,4.0,148.0,4.727272,521.0
2017-02-16,website2,AU,1,91,144,3.0,147.0,4.727272,524.0
2017-02-17,website2,AU,1,91,100,4.0,148.0,4.727272,524.0
2017-02-18,website2,AU,1,91,118,6.0,149.0,4.727272,527.0
2017-02-19,website2,AU,1,91,114,4.0,151.0,4.727272,529.0
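The last column, `count`, is a cumulative count. What I need is the actual (per-row) count for each date + site + country + kind + ID tuple, which would give: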
date,site,country_code,kind,ID,rank,votes,sessions,avg_score,count
2017-03-20,website1,US,0,84,226,0.0,15.0,3.370812,0.0
2017-03-21,website1,US,0,84,214,0.0,15.0,3.370812,0.0
2017-03-22,website1,US,0,84,226,0.0,16.0,3.370812,0.0
2017-03-23,website1,US,0,84,234,0.0,16.0,3.369048,1.0
2017-03-24,website1,US,0,84,226,0.0,16.0,3.369048,0.0
2017-03-25,website1,US,0,84,212,0.0,16.0,3.369048,0.0
2017-03-26,website1,US,0,84,228,0.0,16.0,3.369048,0.0
2017-02-15,website2,AU,1,91,144,4.0,148.0,4.727272,0.0
2017-02-16,website2,AU,1,91,144,3.0,147.0,4.727272,3.0
2017-02-17,website2,AU,1,91,100,4.0,148.0,4.727272,0.0
2017-02-18,website2,AU,1,91,118,6.0,149.0,4.727272,3.0
2017-02-19,website2,AU,1,91,114,4.0,151.0,4.727272,2.0
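I know this will involve a `groupby` call, but beyond that I have no idea how to proceed. Assume the count for the first occurrence of each tuple is 0. Any help would be great. Thanks.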
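Thanks, but that would give the tuple `(2017-02-15,website2,AU,1,91)` a value of `467`, whereas it should be 0. – Craig
I think what the OP wants is: `df.groupby('site')['count'].diff().fillna(0)` – MaxU
@MaxU Thanks a lot! I misunderstood the question. –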
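For anyone landing here later, below is a minimal sketch of the grouped `diff()` suggested in the comments, run against the sample data from the question. It rests on a few assumptions that are not stated in the thread: that the rows should be differenced in date order within each group, and that grouping on `site`, `country_code`, `kind` and `ID` (rather than `site` alone) is what the question intends. `date` is deliberately left out of the grouping keys, since including it would put every row in its own group. The names `csv_text`, `keys`, `naive` and `actual` are only illustrative.

```python
import io

import pandas as pd

# Sample data copied from the question.
csv_text = """date,site,country_code,kind,ID,rank,votes,sessions,avg_score,count
2017-03-20,website1,US,0,84,226,0.0,15.0,3.370812,53.0
2017-03-21,website1,US,0,84,214,0.0,15.0,3.370812,53.0
2017-03-22,website1,US,0,84,226,0.0,16.0,3.370812,53.0
2017-03-23,website1,US,0,84,234,0.0,16.0,3.369048,54.0
2017-03-24,website1,US,0,84,226,0.0,16.0,3.369048,54.0
2017-03-25,website1,US,0,84,212,0.0,16.0,3.369048,54.0
2017-03-26,website1,US,0,84,228,0.0,16.0,3.369048,54.0
2017-02-15,website2,AU,1,91,144,4.0,148.0,4.727272,521.0
2017-02-16,website2,AU,1,91,144,3.0,147.0,4.727272,524.0
2017-02-17,website2,AU,1,91,100,4.0,148.0,4.727272,524.0
2017-02-18,website2,AU,1,91,118,6.0,149.0,4.727272,527.0
2017-02-19,website2,AU,1,91,114,4.0,151.0,4.727272,529.0
"""
df = pd.read_csv(io.StringIO(csv_text))

# Assumed grouping keys: everything in the tuple except the date.
keys = ["site", "country_code", "kind", "ID"]

# Assumption: differences should be taken in date order within each group.
# ISO-formatted date strings sort chronologically, so no parsing is needed here.
df = df.sort_values(keys + ["date"])

# An ungrouped diff leaks across groups: the first website2 row
# becomes 521 - 54 = 467, which is the problem Craig points out.
naive = df["count"].diff().fillna(0)

# A grouped diff restarts for every group; the first row of each group
# has no predecessor (NaN), which fillna(0) turns into 0.
actual = df.groupby(keys)["count"].diff().fillna(0)

df["count"] = actual
print(df[["date", "site", "count"]])
```

With the sample data, grouping by `site` alone gives the same result, because each site has a single country/kind/ID combination. The longer key list only matters once one site appears with several combinations.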