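Reindexing a level of a MultiIndex in an arbitrary order in pandas

I have some code that summarizes a DataFrame containing the well-known Titanic dataset as follows: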
import pandas as pd

# titanic is assumed to be a DataFrame already loaded from the Titanic dataset
titanic['agecat'] = pd.cut(titanic.age, [0, 13, 20, 64, 100],
                           labels=['child', 'adolescent', 'adult', 'senior'])
titanic.groupby(['agecat', 'pclass', 'sex'])['survived'].mean()
This produces the following output, with a MultiIndex based on the groupby call:
agecat      pclass  sex
adolescent  1       female    1.000000
                    male      0.200000
            2       female    0.923077
                    male      0.117647
            3       female    0.542857
                    male      0.125000
adult       1       female    0.965517
                    male      0.343284
            2       female    0.868421
                    male      0.078125
            3       female    0.441860
                    male      0.159184
child       1       female    0.000000
                    male      1.000000
            2       female    1.000000
                    male      1.000000
            3       female    0.483871
                    male      0.324324
senior      1       female    1.000000
                    male      0.142857
            2       male      0.000000
            3       male      0.000000
Name: survived, dtype: float64
However, I would like the agecat level of the MultiIndex to be in its natural order rather than alphabetical order, i.e. ['child', 'adolescent', 'adult', 'senior']. If I try to do this with reindex:
titanic.groupby(['agecat', 'pclass','sex'])['survived'].mean().reindex(
['child', 'adolescent', 'adult', 'senior'], level='agecat')
This has no effect on the MultiIndex of the result. Should this work, or am I using the wrong approach?
I think what you are suggesting *should* work, see the comment here: https://github.com/pydata/pandas/blob/master/pandas/core/index.py#L1346, please open an issue – Jeff
Unfortunately, the OP is correct: 'DataFrame.reindex()' is broken when used with the 'level' keyword, even in the latest pandas development branch as of this date. See https://github.com/pydata/pandas/issues/4088 –
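While reindex with level does not reorder the index as described above, one possible workaround is to pull out each agecat block in the desired order and concatenate the pieces. The following is only a minimal sketch, not a fix from the thread: the Series named grouped is a small made-up stand-in for the real groupby result (its values are invented), and the names grouped, order and reordered are illustrative.

```python
import pandas as pd

# Toy stand-in for the grouped Titanic result; the values are made up.
idx = pd.MultiIndex.from_tuples(
    [
        ("adolescent", 1, "female"),
        ("adolescent", 1, "male"),
        ("adult", 1, "female"),
        ("adult", 1, "male"),
        ("child", 1, "female"),
        ("child", 1, "male"),
        ("senior", 1, "male"),
    ],
    names=["agecat", "pclass", "sex"],
)
grouped = pd.Series(
    [1.0, 0.2, 0.9, 0.3, 0.5, 0.6, 0.1], index=idx, name="survived"
)

# Desired (natural) order of the first index level.
order = ["child", "adolescent", "adult", "senior"]

# Take each agecat block in the desired order and stitch the blocks together.
# drop_level=False keeps agecat in the index of every piece.
reordered = pd.concat(
    [grouped.xs(cat, level="agecat", drop_level=False) for cat in order]
)

print(reordered)
```

This keeps only the combinations that actually exist in the data, so no NaN rows are introduced, at the cost of one selection per category. It assumes every label in order is present in the agecat level; xs raises a KeyError otherwise.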