Pandas DataFrame calculation
I have a fairly complex DataFrame that looks like this:
import pandas as pd

df = pd.DataFrame({'0': {('Total Number of End Points', '0.01um', '0hr'): 12,
('Total Number of End Points', '0.1um', '0hr'): 8,
('Total Number of End Points', 'Control', '0hr'): 4,
('Total Number of End Points', '0.01um', '24hr'): 18,
('Total Number of End Points', '0.1um', '24hr'): 12,
('Total Number of End Points', 'Control', '24hr'): 6,
('Total Vessel Length', '0.01um', '0hr'): 12,
('Total Vessel Length', '0.1um', '0hr'): 8,
('Total Vessel Length', 'Control', '0hr'): 4,
('Total Vessel Length', '0.01um', '24hr'): 18,
('Total Vessel Length', '0.1um', '24hr'): 12,
('Total Vessel Length', 'Control', '24hr'): 6},
'1': {('Total Number of End Points', '0.01um', '0hr'): 12,
('Total Number of End Points', '0.1um', '0hr'): 8,
('Total Number of End Points', 'Control', '0hr'): 4,
('Total Number of End Points', '0.01um', '24hr'): 18,
('Total Number of End Points', '0.1um', '24hr'): 12,
('Total Number of End Points', 'Control', '24hr'): 6,
('Total Vessel Length', '0.01um', '0hr'): 12,
('Total Vessel Length', '0.1um', '0hr'): 8,
('Total Vessel Length', 'Control', '0hr'): 4,
('Total Vessel Length', '0.01um', '24hr'): 18,
('Total Vessel Length', '0.1um', '24hr'): 12,
('Total Vessel Length', 'Control', '24hr'): 6},
'2': {('Total Number of End Points', '0.01um', '0hr'): 12,
('Total Number of End Points', '0.1um', '0hr'): 8,
('Total Number of End Points', 'Control', '0hr'): 4,
('Total Number of End Points', '0.01um', '24hr'): 18,
('Total Number of End Points', '0.1um', '24hr'): 12,
('Total Number of End Points', 'Control', '24hr'): 6,
('Total Vessel Length', '0.01um', '0hr'): 12,
('Total Vessel Length', '0.1um', '0hr'): 8,
('Total Vessel Length', 'Control', '0hr'): 4,
('Total Vessel Length', '0.01um', '24hr'): 18,
('Total Vessel Length', '0.1um', '24hr'): 12,
('Total Vessel Length', 'Control', '24hr'): 6}})
print(df)
                                          0   1   2
Total Number of End Points 0.01um  0hr   12  12  12
                                   24hr  18  18  18
                           0.1um   0hr    8   8   8
                                   24hr  12  12  12
                           Control 0hr    4   4   4
                                   24hr   6   6   6
Total Vessel Length        0.01um  0hr   12  12  12
                                   24hr  18  18  18
                           0.1um   0hr    8   8   8
                                   24hr  12  12  12
                           Control 0hr    4   4   4
                                   24hr   6   6   6
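I'm trying to divide each value by the mean, taken across the columns, of the corresponding Control row (same measurement, same time point). I tried the following, but it didn't work.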
df2 = df.divide(df.xs('Control', level=1).mean(axis=1), axis='index')
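I'm quite new to Python and pandas, so I tend to think about this problem in MS Excel terms.
If this were in Excel, the formula for A1 (row ('Total Number of End Points', '0.01um', '0hr'), column 0) would be:
=A1/AVERAGE($A$5:$C$5)
For B1 (row ('Total Number of End Points', '0.01um', '0hr'), column 1) it would be:
=B1/AVERAGE($A$5:$C$5)
and for A2 (row ('Total Number of End Points', '0.01um', '24hr'), column 0) it would be:
=A2/AVERAGE($A$6:$C$6)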
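To make the Excel analogy concrete, the three cells A1, B1 and A2 could be computed one at a time with label-based lookups. This is only a minimal sketch on a cut-down copy of the frame (one measurement); the level names `measure`, `dose` and `time` are assumptions added for readability, since the original index levels are unnamed.

```python
import pandas as pd

# Cut-down copy of the frame: one measurement only.
# The level names are hypothetical; the original index levels are unnamed.
measure = "Total Number of End Points"
index = pd.MultiIndex.from_product(
    [[measure], ["0.01um", "0.1um", "Control"], ["0hr", "24hr"]],
    names=["measure", "dose", "time"],
)
df = pd.DataFrame(
    [[12] * 3, [18] * 3, [8] * 3, [12] * 3, [4] * 3, [6] * 3],
    index=index,
    columns=["0", "1", "2"],
)

# Mean across columns of each Control row (the AVERAGE(...) part in Excel).
control_0hr = df.loc[(measure, "Control", "0hr"), :].mean()
control_24hr = df.loc[(measure, "Control", "24hr"), :].mean()

# A1 and B1: the 0.01um / 0hr row, columns "0" and "1", over the Control / 0hr mean.
a1 = df.loc[(measure, "0.01um", "0hr"), "0"] / control_0hr
b1 = df.loc[(measure, "0.01um", "0hr"), "1"] / control_0hr

# A2: the 0.01um / 24hr row, column "0", over the Control / 24hr mean.
a2 = df.loc[(measure, "0.01um", "24hr"), "0"] / control_24hr

print(a1, b1, a2)
```

Doing this cell by cell obviously does not scale to the real data; it is only meant to pin down which Control row each value is divided by.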
The desired result for this example would be:
                                          0   1   2
Total Number of End Points 0.01um  0hr    3   3   3
                                   24hr   3   3   3
                           0.1um   0hr    2   2   2
                                   24hr   2   2   2
                           Control 0hr    1   1   1
                                   24hr   1   1   1
Total Vessel Length        0.01um  0hr    3   3   3
                                   24hr   3   3   3
                           0.1um   0hr    2   2   2
                                   24hr   2   2   2
                           Control 0hr    1   1   1
                                   24hr   1   1   1
Note: the real data has many more index entries and columns.
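For what it's worth, one way the whole-frame version of this division might be written is sketched below. It is not a confirmed solution, just a sketch under stated assumptions: the frame is rebuilt here with hypothetical level names (`measure`, `dose`, `time`), and the Control means are broadcast back to every row by reindexing on the remaining two levels.

```python
import pandas as pd

# Rebuild the example frame; the level names are hypothetical
# (the original index levels are unnamed).
index = pd.MultiIndex.from_product(
    [
        ["Total Number of End Points", "Total Vessel Length"],
        ["0.01um", "0.1um", "Control"],
        ["0hr", "24hr"],
    ],
    names=["measure", "dose", "time"],
)
cell = {
    ("0.01um", "0hr"): 12, ("0.01um", "24hr"): 18,
    ("0.1um", "0hr"): 8, ("0.1um", "24hr"): 12,
    ("Control", "0hr"): 4, ("Control", "24hr"): 6,
}
rows = [[cell[(dose, time)]] * 3 for _, dose, time in index]
df = pd.DataFrame(rows, index=index, columns=["0", "1", "2"])

# Mean across columns of each Control row. xs() drops the "dose" level,
# so the result is indexed by (measure, time) only.
ctrl_mean = df.xs("Control", level="dose").mean(axis=1)

# Look up the matching Control mean for every row of df by dropping
# the "dose" level from df's index and reindexing on what is left.
denominator = ctrl_mean.reindex(df.index.droplevel("dose")).to_numpy()

# Divide row-wise; the array has one entry per row of df.
result = df.div(denominator, axis=0)
print(result)
```

On this toy data every 0.01um row comes out as 3.0, every 0.1um row as 2.0, and every Control row as 1.0. Passing a plain array to `div` sidesteps index alignment entirely, which is the design choice here: the two-level `ctrl_mean` index cannot be matched label-for-label against the three-level index of `df`.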
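Can you provide an example of the desired output? – Andrew
When I put your data into a DataFrame, it is different from what you get in print(df). df = ... and print(df) are two different DataFrames. Your print(df) has nothing to do with the code above. Your input columns are ['a','b'], but your printed columns are [0,1,2]. Can you make it all consistent? Thanks. –
@MarkGraph Oops.. you're right.. I'll fix it. – agf1997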