如何根據熊貓數據框中的多列獲取百分比數？

-1

我在數據框中有20列。我列出其中4這裏作爲例子：如何根據熊貓數據框中的多列獲取百分比數？

is_guarantee：0或1
hotel_star：0，1，2，3，4，5
ORDER_STATUS：40，60，80
旅程（標籤）： 0，1，2

is_guarantee hotel_star order_status journey 
0    0   5   60  0 
1    1   5   60  0 
2    1   5   60  0 
3    0   5   60  1 
4    0   4   40  0 
5    0   4   40  1 
6    0   4   40  1 
7    0   3   60  0 
8    0   2   60  0 
9    1   5   60  0 
10    0   2   60  0 
11    0   2   60  0

Click to View Image

但該系統需要輸入發生矩陣像以下格式函數：

Click to View Image

身體能幫助嗎？

df1 = pd.DataFrame(index=range(0,20)) 
df1['is_guarantee'] = np.random.choice([0,1], df1.shape[0]) 
df1['hotel_star'] = np.random.choice([0,1,2,3,4,5], df1.shape[0]) 
df1['order_status'] = np.random.choice([40,60,80], df1.shape[0]) 
df1['journey '] = np.random.choice([0,1,2], df1.shape[0])

來源

2017-12-18 DataHolic

您到目前爲止嘗試過什麼？ –

我想在問題中將您的數據編輯爲_text_。我無法將圖片複製並粘貼到我的終端，我不想從頭開始輸入。請讓每個人的生活變得輕鬆，將您的數據和期望的輸出作爲文本發佈在您的問題中。沒有數據=沒有幫助。 –

@jezrael ...沒有人在這裏迫害你，至少我是這樣的。我告訴你我尊重你的知識。不幸的是，有時候你會對網站做一些不健康的事情。這不是我的看法。無論如何，我已經重新開放這個問題，享受。 –

我想你需要：

重塑通過melt和groupby與size得到計數，通過unstack
重塑然後劃分和每行和參加MultiIndex到index：

df = (df.melt('journey') 
     .astype(str) 
     .groupby(['variable', 'journey','value']) 
     .size() 
     .unstack(1, fill_value=0)) 

df = (df.div(df.sum(1), axis=0) 
     .mul(100) 
     .add_prefix('journey_') 
     .set_index(df.index.map(' = '.join)) 
     .rename_axis(None, 1)) 

print (df) 

        journey_0 journey_1 
hotel_star = 2  100.000000 0.000000 
hotel_star = 3  100.000000 0.000000 
hotel_star = 4  33.333333 66.666667 
hotel_star = 5  80.000000 20.000000 
is_guarantee = 0 66.666667 33.333333 
is_guarantee = 1 100.000000 0.000000 
order_status = 40 33.333333 66.666667 
order_status = 60 88.888889 11.111111

來源

2017-12-18 11:44:06 jezrael

非常感謝！讓我研究它：） – DataHolic

如何根據熊貓數據框中的多列獲取百分比數？

回答

相關問題