Option 1
pd.crosstab
df1
Date Color Jar
0 05-10-2017 Red 1
1 05-10-2017 Green 2
2 05-10-2017 Blue 1
3 05-10-2017 Red 2
4 05-10-2017 Blue 1
5 05-11-2017 Red 2
6 05-11-2017 Green 1
7 05-11-2017 Red 2
8 05-11-2017 Green 1
9 05-11-2017 Blue 1
10 05-11-2017 Blue 2
11 05-11-2017 Red 2
12 05-11-2017 Blue 2
13 05-11-2017 Blue 1
14 05-12-2017 Green 2
15 05-12-2017 Blue 1
16 05-12-2017 Red 1
17 05-12-2017 Blue 2
18 05-12-2017 Blue 2
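import pandas as pd
# Count rows per Date (index) for every (Jar, Color) pair (columns)
df1 = pd.crosstab(df1.Date, [df1.Jar, df1.Color])
# Flatten the (Jar, Color) column tuples into strings such as '1 Blue'
df1.columns = df1.columns.map('{0[0]} {0[1]}'.format) # borrowed this line from https://stackoverflow.com/a/46102413/4909087
df1 = df1.add_prefix('Jar ')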
df1
            Jar 1 Blue  Jar 1 Green  Jar 1 Red  Jar 2 Blue  Jar 2 Green  Jar 2 Red
Date
05-10-2017           2            0          1           0            1          1
05-11-2017           2            2          0           2            0          3
05-12-2017           1            0          1           2            1          0
Option 2
pd.get_dummies and df.groupby
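# df1 here is the long-format frame printed at the top (Date, Color, Jar columns)
df1 = df1.set_index('Date')
# Join Jar and Color into one label per row, one-hot encode it, then sum the indicators per date
df1 = pd.get_dummies(df1.Jar.astype(str).str.cat(df1.Color, sep=' '))\
    .add_prefix('Jar ').groupby(level=0).sum()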
df1
            Jar 1 Blue  Jar 1 Green  Jar 1 Red  Jar 2 Blue  Jar 2 Green  Jar 2 Red
Date
05-10-2017           2            0          1           0            1          1
05-11-2017           2            2          0           2            0          3
05-12-2017           1            0          1           2            1          0
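For anyone who wants to reproduce the two tables, here is a minimal self-contained sketch. It rebuilds the sample frame from the printout at the top and checks that both options give the same counts. The names `df`, `via_crosstab` and `via_dummies` are my own illustrative choices (not from the code above), picked so the input frame is never overwritten.

```python
import pandas as pd

# Sample frame, transcribed from the table printed at the top of this answer.
df = pd.DataFrame({
    'Date': ['05-10-2017'] * 5 + ['05-11-2017'] * 9 + ['05-12-2017'] * 5,
    'Color': ['Red', 'Green', 'Blue', 'Red', 'Blue',
              'Red', 'Green', 'Red', 'Green', 'Blue', 'Blue', 'Red', 'Blue', 'Blue',
              'Green', 'Blue', 'Red', 'Blue', 'Blue'],
    'Jar': [1, 2, 1, 2, 1,
            2, 1, 2, 1, 1, 2, 2, 2, 1,
            2, 1, 1, 2, 2],
})

# Option 1: crosstab, then flatten the (Jar, Color) column tuples.
via_crosstab = pd.crosstab(df.Date, [df.Jar, df.Color])
via_crosstab.columns = via_crosstab.columns.map('{0[0]} {0[1]}'.format)
via_crosstab = via_crosstab.add_prefix('Jar ')

# Option 2: one-hot encode a combined "Jar Color" label, then sum per date.
indexed = df.set_index('Date')
labels = indexed.Jar.astype(str).str.cat(indexed.Color, sep=' ')
via_dummies = pd.get_dummies(labels).add_prefix('Jar ').groupby(level=0).sum()

# Compare labels and counts rather than dtypes, which can differ between pandas versions.
same_columns = list(via_crosstab.columns) == list(via_dummies.columns)
same_index = list(via_crosstab.index) == list(via_dummies.index)
same_counts = bool((via_crosstab.to_numpy() == via_dummies.to_numpy()).all())
print(same_columns, same_index, same_counts)
```

Keeping the input under a separate name also means both options can be run back to back in one session.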
Performance
Small
100 loops, best of 3: 13.4 ms per loop # pivot_table
100 loops, best of 3: 9.05 ms per loop # stacking, grouping, unstacking
100 loops, best of 3: 10.4 ms per loop # crosstab
100 loops, best of 3: 3.57 ms per loop # get_dummies
Large (df * 10000)
10 loops, best of 3: 42.8 ms per loop # pivot_table
1 loop, best of 3: 913 ms per loop # stacking, grouping, unstacking
10 loops, best of 3: 43.1 ms per loop # crosstab
1 loop, best of 3: 885 ms per loop # get_dummies
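Which one to use depends on your data: on the small frame get_dummies is the fastest, while on the large frame pivot_table and crosstab are roughly 20 times faster than get_dummies and the stack/group/unstack approach.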
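The timing harness itself is not shown above, so the following is only a rough sketch of how such a comparison could be rerun. It makes some assumptions: the data is synthetic (random rows with the same three columns), the large case is simply a frame with 10000 times as many rows, and `make_frame`, `with_crosstab`, `with_get_dummies` and `best_ms` are hypothetical helper names. Only the two options from this answer are timed, and the numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd


def make_frame(n_rows, seed=0):
    """Build a synthetic long-format frame with Date, Color and Jar columns."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        'Date': rng.choice(['05-10-2017', '05-11-2017', '05-12-2017'], n_rows),
        'Color': rng.choice(['Red', 'Green', 'Blue'], n_rows),
        'Jar': rng.choice([1, 2], n_rows),
    })


def with_crosstab(df):
    """Option 1: crosstab, then flatten the (Jar, Color) column tuples."""
    out = pd.crosstab(df.Date, [df.Jar, df.Color])
    out.columns = out.columns.map('{0[0]} {0[1]}'.format)
    return out.add_prefix('Jar ')


def with_get_dummies(df):
    """Option 2: one-hot encode a combined label, then sum per date."""
    indexed = df.set_index('Date')
    labels = indexed.Jar.astype(str).str.cat(indexed.Color, sep=' ')
    return pd.get_dummies(labels).add_prefix('Jar ').groupby(level=0).sum()


def best_ms(func, df, repeat=3, number=1):
    """Best-of-`repeat` wall time per call, in milliseconds."""
    timings = timeit.repeat(lambda: func(df), repeat=repeat, number=number)
    return min(timings) / number * 1000


if __name__ == '__main__':
    for n_rows in (19, 19 * 10000):
        frame = make_frame(n_rows)
        for func in (with_crosstab, with_get_dummies):
            print(f'{n_rows:>7} rows  {func.__name__:<17} {best_ms(func, frame):8.2f} ms')
```

Taking the minimum over a few repeats mirrors the "best of 3" convention in the figures above.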
First of all, WOW, on a phone...
I took a long lunch ;)
If your question has been answered, you can [accept the most helpful answer](https://stackoverflow.com/help/someone-answers).