It looks like the columns are a MultiIndex:
print(n.columns)
MultiIndex(levels=[[u'Count', u'sum'], [u'', u'F', u'M']],
labels=[[0, 0, 1], [1, 2, 0]],
names=[None, u'Gender'])
So first select the F and M columns by using slicers. Then replace the missing values with 0 using fillna and divide by the sum column:
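idx = pd.IndexSlice
# Select the F and M counts and the row totals from the MultiIndex columns
F = n.loc[:, idx['Count','F']]
M = n.loc[:, idx['Count','M']]
total = n.loc[:, idx['sum','']]
# A missing count means zero occurrences, so fill NaN with 0 before dividing
n['%F'] = F.fillna(0)/total * 100
n['%M'] = M.fillna(0)/total * 100
print(n)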
                 Count                     sum          %F          %M
Gender               F           M
Name
Aaban              NaN   10.285710   10.285710    0.000000  100.000000
Aabfla        7.000000         NaN    7.000000  100.000000    0.000000
Aabid              NaN    5.000000    5.000000    0.000000  100.000000
Aabrielle     5.000000         NaN    5.000000  100.000000    0.000000
Aadarn             NaN    8.521739    8.521739    0.000000  100.000000
Aadan              NaN   12.000000   12.000000    0.000000  100.000000
Aadar              NaN   11.285710   11.285710    0.000000  100.000000
Aaden         5.000000  279.002857  284.002857    1.760546   98.239454
Aade               NaN    5.000000    5.000000    0.000000  100.000000
Aadhav             NaN   12.750000   12.750000    0.000000  100.000000
Aadhavan           NaN    6.333333    6.333333    0.000000  100.000000
Aadhi              NaN    6.000000    6.000000    0.000000  100.000000
Aadhira       0.888857         NaN    9.000007    9.876181    0.000000
Aadhve       79.875000         NaN   79.875000  100.000000    0.000000
Aadhven            NaN    5.000000    5.000000    0.000000  100.000000
Aadi          5.333333   55.583333   60.910007    8.756087   91.254846
Aadian             NaN    5.000000    5.000000    0.000000  100.000000
Aadil              NaN   12.913003   12.913003    0.000000  100.000000
Aadin              NaN   12.000000   12.000000    0.000000  100.000000
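
To try the approach without the original data, here is a minimal self-contained sketch. The three-row frame, the names "A", "B" and "C", and the variable names total, pct_f and pct_m are made up for illustration; only the column layout (Count/F, Count/M, and sum with an empty second level) follows the MultiIndex shown above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column layout mirrors the question.
columns = pd.MultiIndex.from_tuples(
    [("Count", "F"), ("Count", "M"), ("sum", "")],
    names=[None, "Gender"],
)
n = pd.DataFrame(
    [[np.nan, 10.0, 10.0],
     [7.0, np.nan, 7.0],
     [5.0, 15.0, 20.0]],
    index=pd.Index(["A", "B", "C"], name="Name"),
    columns=columns,
)

idx = pd.IndexSlice

# A full (level 0, level 1) key selects one column and returns a Series.
total = n.loc[:, idx["sum", ""]]

# Treat a missing count as zero, then express it as a share of the row total.
pct_f = n.loc[:, idx["Count", "F"]].fillna(0) / total * 100
pct_m = n.loc[:, idx["Count", "M"]].fillna(0) / total * 100

# Assigning with a plain string key adds a column whose second level is ''.
n["%F"] = pct_f
n["%M"] = pct_m

print(n)
```

Filling NaN before the division matters here: without it, a name that appears for only one gender would get NaN instead of 0 in the other gender's percentage column.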
If you know how you would do this in SQL, then maybe this section of the pandas docs (http://pandas.pydata.org/pandas-docs/version/0.18.1/comparison_with_sql.html) will help. – pbreach