2016-05-15 47 views

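Create a MultiIndex from a pattern in column names

I'm trying to write a script that takes a DataFrame with any number of experimental conditions (e.g., 3 different concentrations of a drug) and any number of replicates of each condition (i.e., trials 1-3), which looks like this: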

 100_uM_Drug_Trial_1 100_uM_Drug_Trial_2 10_uM_Drug_Trial_1 \ 
0    459.924747   635.685284   518.163653 
1    459.458934   636.249568   518.445279 
2    460.006374   636.435523   518.743388 
3    460.002453   636.794022   518.895792 
4    460.598404   636.103206   518.836557 
5    460.309564   637.187444   518.976234 
6    460.609499   636.335023   519.005662 
7    460.843505   637.123839   519.041012 
8    460.969187   637.047453   518.880728 
9    460.832477   637.231533   519.108122 
10   461.255201   638.176752   518.979086 
11   461.310764   636.924448   518.979923 
12   461.507783   637.824450   519.117064 
13   461.116555   637.145600   519.106675 
14   461.891845   638.136241   519.531348 
15   461.746859   637.819223   519.161308 
16   461.840650   637.977134   519.203945 
17   462.028374   638.474671   519.184845 
18   461.726244   638.039615   519.225926 
19   462.128634   638.624309   519.177030 
20   461.242868   637.636891   519.460114 
21   462.201164   638.493620   519.469176 
22   464.078771   637.749872   519.505141 
23   464.605662   639.119425   519.654590 
24   464.352002   638.789306   519.947157 
25   464.485028   638.656634   519.822459 
26   464.506035   639.428889   519.906759 
27   464.834154   638.481042   520.143631 
28   464.886412   639.267176   520.218972 
29   465.414446   638.661687   520.384017 

...and MultiIndex it by both condition and trial, so that it looks like this:

Condition  100_uM_Drug       10_uM_Drug 
Trial   1     2     1 
0    459.924747   635.685284   518.163653 
1    459.458934   636.249568   518.445279 
2    460.006374   636.435523   518.743388 
3    460.002453   636.794022   518.895792 
4    460.598404   636.103206   518.836557 
5    460.309564   637.187444   518.976234 
6    460.609499   636.335023   519.005662 
7    460.843505   637.123839   519.041012 
8    460.969187   637.047453   518.880728 
9    460.832477   637.231533   519.108122 
10   461.255201   638.176752   518.979086 
11   461.310764   636.924448   518.979923 
12   461.507783   637.824450   519.117064 
13   461.116555   637.145600   519.106675 
14   461.891845   638.136241   519.531348 
15   461.746859   637.819223   519.161308 
16   461.840650   637.977134   519.203945 
17   462.028374   638.474671   519.184845 
18   461.726244   638.039615   519.225926 
19   462.128634   638.624309   519.177030 
20   461.242868   637.636891   519.460114 
21   462.201164   638.493620   519.469176 
22   464.078771   637.749872   519.505141 
23   464.605662   639.119425   519.654590 
24   464.352002   638.789306   519.947157 
25   464.485028   638.656634   519.822459 
26   464.506035   639.428889   519.906759 
27   464.834154   638.481042   520.143631 
28   464.886412   639.267176   520.218972 
29   465.414446   638.661687   520.384017 

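I've tried several approaches, including filtering the column names with regular expressions, but I haven't gotten any of them to work. Is there a simple, quick way to do this that I'm missing?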

THX

1 Answer

You can use MultiIndex.from_tuples() and split the column names (see docs):

df.columns = pd.MultiIndex.from_tuples([('_'.join(col.split('_')[:3]), col.split('_')[-1]) for col in df.columns], names=['Drug', 'Trial']) 

which produces:

Drug   100_uM_Drug              10_uM_Drug
Trial            1           2           1
0       459.924747  635.685284  518.163653
1       459.458934  636.249568  518.445279
2       460.006374  636.435523  518.743388
3       460.002453  636.794022  518.895792
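
The one-liner above relies on every condition name having exactly three underscore-separated parts. Since the question asks for arbitrary conditions, here is a minimal sketch of a variant that splits on the `_Trial_<number>` suffix instead. It assumes every column follows the pattern `<condition>_Trial_<number>`. The helper name `split_name` and the two-row toy frame are made up for illustration, and the level names follow the question (`Condition`) rather than the answer (`Drug`).

```python
import re

import pandas as pd

# Toy frame using the column-name pattern from the question (first two rows only).
df = pd.DataFrame({
    "100_uM_Drug_Trial_1": [459.924747, 459.458934],
    "100_uM_Drug_Trial_2": [635.685284, 636.249568],
    "10_uM_Drug_Trial_1": [518.163653, 518.445279],
})

# Assumed naming pattern: "<condition>_Trial_<number>".
PATTERN = re.compile(r"^(?P<condition>.+)_Trial_(?P<trial>\d+)$")


def split_name(col):
    """Hypothetical helper: split a column name into (condition, trial number)."""
    match = PATTERN.match(col)
    if match is None:
        raise ValueError(f"unexpected column name: {col!r}")
    return match.group("condition"), int(match.group("trial"))


df.columns = pd.MultiIndex.from_tuples(
    [split_name(col) for col in df.columns],
    names=["Condition", "Trial"],
)

# All trials of one condition (columns are now just the trial numbers).
drug_100 = df["100_uM_Drug"]

# Mean across trials for each condition; transposing avoids groupby(axis=1).
condition_means = df.T.groupby(level="Condition").mean().T
```

Unlike the one-liner, this stores the trial numbers as integers rather than strings, and a column that does not match the pattern raises an error instead of being split incorrectly.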

Perfect, exactly what I needed - thanks – rchurt

You're welcome - http://stackoverflow.com/help/someone-answers – Stefan