2017-08-09 81 views
1

Python pandas - convert columns to percentages on a groupby DataFrame

I have a DataFrame that is created by a groupby:

hmdf = pd.DataFrame(hm01) 
new_hm01 = hmdf[['FinancialYear','Month','FirstReceivedDate']] 

hm05 = new_hm01.pivot_table(index=['FinancialYear','Month'], aggfunc='count') 
vals1 = ['April ', 'May  ', 'June  ', 'July  ', 'August ', 'September', 'October ', 'November ', 'December ', 'January ', 'February ', 'March '] 

df_hm = new_hm01.groupby(['Month', 'FinancialYear']).size().unstack(fill_value=0).rename(columns=lambda x: '{}'.format(x)) 
df_hml = df_hm.reindex(vals1) 

The DataFrame looks like this:

FinancialYear 2014/2015 2015/2016 2016/2017 2017/2018 
Month    
April    34   24   22   20 
May     29   26   21   25 
June    19   39   22   20 
July    23   39   18   20 
August    36   30   34   0 
September   35   23   41   0 
October    36   37   27   0 
November   38   31   30   0 
December   36   41   23   0 
January    34   30   35   0 
February   37   26   37   0 
March    36   31   33   0 

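The column names are variables (threeYr, twoYr, oneYr, Yr). I want to convert the DataFrame so that each number is a percentage of its column's total, but I can't get it to work.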

This is what I want:

FinancialYear  2014/2015 2015/2016 2016/2017 2017/2018 
Month    
April     9%   6%   6%   24% 
May      7%   7%   6%   29% 
June     5%   10%   6%   24% 
July     6%   10%   5%   24% 
August     9%   8%   10%   0% 
September    9%   6%   12%   0% 
October     9%   10%   8%   0% 
November    10%   8%   9%   0% 
December    9%   11%   7%   0% 
January     9%   8%   10%   0% 
February    9%   7%   11%   0% 
March     9%   8%   10%   0% 

Can anyone help me do this?

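Edit: I tried the answer found at this link: pandas convert columns to percentages of the totals. I couldn't get it to work for my DataFrame, and it doesn't explain well (to me) how to make it work for any DataFrame. In my opinion, John Galt's answer is better than that one.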

+1

Possible duplicate of [pandas convert columns to percentages of the totals](https://stackoverflow.com/questions/42006346/pandas-convert-columns-to-percentages-of-the-totals) –

+0

I followed that, but I couldn't get it to work for my DataFrame, hence my question. Thank you, though – ScoutEU

Answers

5

Here's one way:

In [1371]: (100. * df/df.sum()).round(0) 
Out[1371]: 
       2014/2015 2015/2016 2016/2017 2017/2018 
FinancialYear 
April    9.0  6.0  6.0  24.0 
May     7.0  7.0  6.0  29.0 
June     5.0  10.0  6.0  24.0 
July     6.0  10.0  5.0  24.0 
August    9.0  8.0  10.0  0.0 
September   9.0  6.0  12.0  0.0 
October    9.0  10.0  8.0  0.0 
November   10.0  8.0  9.0  0.0 
December    9.0  11.0  7.0  0.0 
January    9.0  8.0  10.0  0.0 
February    9.0  7.0  11.0  0.0 
March    9.0  8.0  10.0  0.0 
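
As for why this carries over to any DataFrame: `df.sum()` returns a Series holding one total per column, and dividing a DataFrame by a Series aligns on column labels, so every column is divided by its own total. Here is a minimal sketch on a made-up subset of the data above (three months, two years); the names `counts`, `col_totals`, `pct` and `pct_explicit` are only illustrative.

```python
import pandas as pd

# Small stand-in for the question's frame (three months, two years).
counts = pd.DataFrame(
    {"2014/2015": [34, 29, 19], "2017/2018": [20, 25, 20]},
    index=pd.Index(["April", "May", "June"], name="Month"),
)

col_totals = counts.sum()            # Series: one total per column
pct = 100.0 * counts / col_totals    # division aligns on column labels

# Same calculation with the axis spelled out explicitly.
pct_explicit = counts.div(col_totals, axis="columns") * 100.0

print(pct.round(1))
```

Each column of `pct` sums to 100, which is a quick sanity check that the division ran down the columns rather than across the rows.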

And, if you want the values rounded to 1 decimal place, as strings with a '%':

In [1375]: (100. * df/df.sum()).round(1).astype(str) + '%' 
Out[1375]: 
       2014/2015 2015/2016 2016/2017 2017/2018 
FinancialYear 
April    8.7%  6.4%  6.4%  23.5% 
May    7.4%  6.9%  6.1%  29.4% 
June    4.8%  10.3%  6.4%  23.5% 
July    5.9%  10.3%  5.2%  23.5% 
August    9.2%  8.0%  9.9%  0.0% 
September   8.9%  6.1%  12.0%  0.0% 
October   9.2%  9.8%  7.9%  0.0% 
November   9.7%  8.2%  8.7%  0.0% 
December   9.2%  10.9%  6.7%  0.0% 
January   8.7%  8.0%  10.2%  0.0% 
February   9.4%  6.9%  10.8%  0.0% 
March    9.2%  8.2%  9.6%  0.0% 
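
The desired output in the question shows whole-number percentages such as 9%. One possible way to get that, while keeping the numeric frame around for further calculations, is to format the values into a separate frame of strings. This is only a sketch on a made-up three-month subset; the names `counts`, `pct` and `pct_labels` are illustrative, and it assumes no column sums to zero.

```python
import pandas as pd

# Small stand-in for the question's frame (three months, two years).
counts = pd.DataFrame(
    {"2014/2015": [34, 29, 19], "2017/2018": [20, 25, 20]},
    index=pd.Index(["April", "May", "June"], name="Month"),
)

# Numeric percentages of each column total (kept as floats).
pct = 100.0 * counts / counts.sum()

# Separate display frame: each value as a whole-number percent string.
pct_labels = pct.apply(lambda col: col.map("{:.0f}%".format))

print(pct_labels)
```

Keeping `pct` numeric means it can still be sorted, summed or plotted; `pct_labels` holds strings and is meant for display only.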
+0

John, this works really well! Thank you very much :) ... I really appreciate the different solution – ScoutEU