2017-07-01

Python pandas DataFrame concat and groupby functions

I have data in Excel as follows:

category size1 size2 size3 
cat1 10 20 30 
cat2 20 10 15 
cat3 30 20 10 

I want to produce two reports/outputs as follows:

#1) 
Category-sizetype-value 
cat1 size1 10 
cat1 size2 20 
cat1 size3 30 
cat2 size1 20 

...

#2) 
Category-size-value-value counts (i.e. how many times a specific size value appears)
cat1 size1 10 3 times 
cat1 size2 20 2 times 
cat1 size3 30 1 time 
cat2 size1 20 4 times 

...

This is the code I have written so far. I would appreciate some pointers: why doesn't pd.concat work here?

import pandas as pd 
path_to_file = r'C:\Users\Niru\Desktop\cat-sizes.xlsx'  # raw string so the backslashes are not read as escapes
xl = pd.ExcelFile(path_to_file) 
print(xl.sheet_names) 
df = xl.parse('Sheet1') 
#print(df.head()) 
print(df.columns) 
frames = [] 
for i in df.columns: 
    dfd = "df.loc[:,['Category','" +i+"']]" 
    frames.append(dfd) 
print(pd.concat(frames)) 

The concatenation doesn't work because the list variable 'frames' is not filled with DataFrames, but with strings. – Xukrao
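As a rough sketch of what the comment above describes, the loop can append real DataFrame slices instead of strings that merely look like code. The in-memory `df` below is a stand-in for the Excel sheet, and the lowercase `category` column name and the `sizetype`/`value` names are assumptions chosen for illustration.

```python
import pandas as pd

# Stand-in for the Excel sheet in the question (assumed column names).
df = pd.DataFrame({
    "category": ["cat1", "cat2", "cat3"],
    "size1": [10, 20, 30],
    "size2": [20, 10, 20],
    "size3": [30, 15, 10],
})

frames = []
for col in df.columns.drop("category"):
    # Select actual columns, then give every slice the same column names
    # so that pd.concat stacks them into one "value" column.
    part = df[["category", col]].rename(columns={col: "value"})
    part.insert(1, "sizetype", col)  # remember which size column this was
    frames.append(part)

long_df = pd.concat(frames, ignore_index=True)
print(long_df)
```

Every element of `frames` is now a DataFrame, so `pd.concat` has something it can stack.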

Answer


Your example data and output confuse me a little, but I think this is what you want.

#Q1: 

df1=pd.melt(df, id_vars=['category'], value_vars=['size1','size2','size3']) 


Out[66]: 
  category variable  value
0     cat1    size1     10
1     cat2    size1     20
2     cat3    size1     30
3     cat1    size2     20
4     cat2    size2     10
5     cat3    size2     20
6     cat1    size3     30
7     cat2    size3     15
8     cat3    size3     10
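The melted frame is ordered by size column, whereas the desired report #1 in the question is ordered by category. A minimal sketch of one way to get that ordering, assuming the same sample data; the `sizetype` name passed to `var_name` is an illustrative choice, not something from the original post.

```python
import pandas as pd

# Sample data mirroring the table in the question.
df = pd.DataFrame({
    "category": ["cat1", "cat2", "cat3"],
    "size1": [10, 20, 30],
    "size2": [20, 10, 20],
    "size3": [30, 15, 10],
})

long_df = (
    pd.melt(df, id_vars=["category"], var_name="sizetype", value_name="value")
    .sort_values(["category", "sizetype"])  # group rows by category first
    .reset_index(drop=True)                 # renumber rows after sorting
)
print(long_df)
```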

#Q2: 

df1['counts'] = df1.groupby(['variable', 'value'])['category'].transform('count')  # select one column so a Series is assigned

Out[69]: 
  category variable  value  counts
0     cat1    size1     10       1
1     cat2    size1     20       1
2     cat3    size1     30       1
3     cat1    size2     20       2
4     cat2    size2     10       1
5     cat3    size2     20       2
6     cat1    size3     30       1
7     cat2    size3     15       1
8     cat3    size3     10       1

Or, for Q2:

df1['counts'] = df1.groupby('variable')['category'].transform('count')  # select one column so a Series is assigned

Out[71]: 
  category variable  value  counts
0     cat1    size1     10       3
1     cat2    size1     20       3
2     cat3    size1     30       3
3     cat1    size2     20       3
4     cat2    size2     10       3
5     cat3    size2     20       3
6     cat1    size3     30       3
7     cat2    size3     15       3
8     cat3    size3     10       3
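If a compact summary is preferred over a per-row counts column, one possible variant is to count each (variable, value) pair once with `groupby(...).size()`. This is only a sketch on the sample data from the question, not necessarily the report the asker had in mind.

```python
import pandas as pd

# Sample data mirroring the table in the question.
df = pd.DataFrame({
    "category": ["cat1", "cat2", "cat3"],
    "size1": [10, 20, 30],
    "size2": [20, 10, 20],
    "size3": [30, 15, 10],
})

df1 = pd.melt(df, id_vars=["category"], value_vars=["size1", "size2", "size3"])

# One row per distinct (variable, value) pair with its number of occurrences.
counts = (
    df1.groupby(["variable", "value"])
    .size()
    .reset_index(name="counts")
)
print(counts)
```

Unlike `transform`, this collapses duplicate pairs, so the result has fewer rows than `df1`.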

Thanks! melt is exactly what we were looking for to unpivot the data. Thanks, Wen! – Niru