I think you need to reshape with unstack; after that, some data cleaning is necessary:
df = df.set_index(['MONTH','WEEKDAY','EVAL'])['0'].unstack()
#if you get ValueError: Index contains duplicate entries, cannot reshape
#there are duplicates and you need to aggregate the data first with mean, sum...
#df = df.groupby(['MONTH','WEEKDAY','EVAL'])['0'].mean().unstack()
#df = df.pivot_table(index=['MONTH','WEEKDAY'], columns='EVAL', values='0', aggfunc='mean')
print (df)
EVAL             0   1
MONTH WEEKDAY
1     0        400  20
      1        300  20
2     0        200  35
      1        450  26
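#sort by WEEKDAY then MONTH, drop the MONTH level, prefix the EVAL columns
#(the parentheses let the method chain span several lines)
df = (df.sort_index(level=[1,0])
        .reset_index(level=0, drop=True)
        .add_prefix('EVAL_')
        .reset_index()
        .rename_axis(None, axis=1))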
print (df)
   WEEKDAY  EVAL_0  EVAL_1
0        0     400      20
1        0     200      35
2        1     300      20
3        1     450      26
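For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the printed output, and the value column is assumed to be labelled with the string '0'; if yours is the integer 0, select it with [0] instead. The name `wide` is only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the printed output above; the value column is
# assumed to be labelled with the string '0'.
df = pd.DataFrame({
    'MONTH':   [1, 1, 1, 1, 2, 2, 2, 2],
    'WEEKDAY': [0, 0, 1, 1, 0, 0, 1, 1],
    'EVAL':    [0, 1, 0, 1, 0, 1, 0, 1],
    '0':       [400, 20, 300, 20, 200, 35, 450, 26],
})

wide = (
    df.set_index(['MONTH', 'WEEKDAY', 'EVAL'])['0']
      .unstack()                         # EVAL values become columns
      .sort_index(level=[1, 0])          # order rows by WEEKDAY, then MONTH
      .reset_index(level=0, drop=True)   # drop the MONTH level
      .add_prefix('EVAL_')               # 0, 1 -> EVAL_0, EVAL_1
      .reset_index()                     # WEEKDAY back to a regular column
      .rename_axis(None, axis=1)         # remove the leftover 'EVAL' columns name
)
print(wide)
```

Wrapping the chain in parentheses is what allows it to be split over several lines without backslashes.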
Sample with duplicates:
print (df)
   MONTH  WEEKDAY  EVAL    0
0      1        0     0  400
1      1        0     1   20
2      1        1     0  300
3      1        1     1   20
4      2        0     0  200
5      2        0     1   35
6      2        1     0  450
7      2        1     1   26
8      2        1     1  100 <-duplicate
df = df.groupby(['MONTH','WEEKDAY','EVAL'])['0'].mean().unstack()
df = (df.sort_index(level=[1,0])
        .reset_index(level=0, drop=True)
        .add_prefix('EVAL_')
        .reset_index()
        .rename_axis(None, axis=1))
print (df)
   WEEKDAY  EVAL_0  EVAL_1
0        0     400      20
1        0     200      35
2        1     300      20
3        1     450      63 <- value is the mean, (100 + 26)/2
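To see why the aggregation step matters, the following sketch (same assumed '0' column label, sample data copied from the duplicate example above) first shows the plain unstack failing and then the two aggregating alternatives mentioned in the comments of the first snippet. The names `keys`, `by_groupby` and `by_pivot` are only illustrative.

```python
import pandas as pd

# Sample with one repeated (MONTH, WEEKDAY, EVAL) combination: (2, 1, 1).
df = pd.DataFrame({
    'MONTH':   [1, 1, 1, 1, 2, 2, 2, 2, 2],
    'WEEKDAY': [0, 0, 1, 1, 0, 0, 1, 1, 1],
    'EVAL':    [0, 1, 0, 1, 0, 1, 0, 1, 1],
    '0':       [400, 20, 300, 20, 200, 35, 450, 26, 100],
})
keys = ['MONTH', 'WEEKDAY', 'EVAL']

# Plain unstack refuses to reshape when a key combination repeats.
try:
    df.set_index(keys)['0'].unstack()
    raised = False
except ValueError as err:
    raised = True
    print('unstack failed:', err)

# Aggregating first collapses the repeated key into one value per cell.
by_groupby = df.groupby(keys)['0'].mean().unstack()
by_pivot = df.pivot_table(index=['MONTH', 'WEEKDAY'], columns='EVAL',
                          values='0', aggfunc='mean')
print(by_groupby)
```

Both aggregating forms should give the same table here; which aggregate is appropriate (mean, sum, first, ...) depends on what the duplicated rows mean in your data.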
It worked for me once I changed ['0'].unstack() to [0].unstack(). Thanks. – Dinosaurius
Sorry, I forgot about that. You can also give the 0 column an explicit name with result = df.groupby(['MONTH','WEEKDAY','EVAL']).size().reset_index(name='COL') and then use that name instead. Thanks. – jezrael
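The last two comments turn on where a column labelled 0 comes from. As a rough sketch, assuming the frame was produced by a groupby(...).size() count (the `raw` frame below is made up for illustration and is not the asker's data): calling reset_index() on the unnamed result labels the counts with the integer 0, while passing name='COL' gives them an explicit label.

```python
import pandas as pd

# Hypothetical raw rows, one per observation, before counting.
raw = pd.DataFrame({
    'MONTH':   [1, 1, 1, 2],
    'WEEKDAY': [0, 0, 1, 0],
    'EVAL':    [0, 0, 1, 1],
})
keys = ['MONTH', 'WEEKDAY', 'EVAL']

# Unnamed Series -> the count column gets the integer label 0.
default = raw.groupby(keys).size().reset_index()

# Naming the column avoids the ambiguity between 0 and '0'.
named = raw.groupby(keys).size().reset_index(name='COL')

print(default.columns.tolist())
print(named.columns.tolist())
```

With the integer label the column has to be selected as df[0]; with a string label such as 'COL' it is selected as df['COL'].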