2016-04-29

Given the following MultiIndexed pd.DataFrame, how do I select a value by row from a df.ix[] view?

Type      p&l position rolldate  value  vola 
Date  Symbol               
2008-01-02 AC  1757.2168  1 1201132800 45588.9161 480.6781 
      AUD   0.0000  0 1205280000 59872.0044 542.8067 
      BAX  551.1540  2 1208736000 165621.7706 125.8527 
      BTP   0.0000  0   -1   NaN  0.0000 
      C   674.4908  2 1202342400 14407.1226 137.4325 
      CAC40  0.0000  0 1200441600 55565.0000 580.2757 
      CAD   0.0000  0 1205280000 68784.0414 593.7115 
      CC  422.1133  1 1202428800 14276.9608 197.4064 
      CGB  482.2597  1 1203638400 79655.5288 299.6622 
      CHF  -1216.9798  -1 1205280000 76431.4406 391.3853 
      CL   0.0000  0 1200355200 67824.0741 1268.3927 
      COIL  0.0000  0 1199750400 66612.2004 1088.8291 
      CT  296.1601  1 1202774400 23447.7124 239.7177 
      D   217.8649  1 1202688000 13201.2527 210.3416 
      DAX   0.0000  0 1205798400 200712.5000 1644.8412 
      DX  469.7712  -1 1205193600 51749.7277 215.9024 
      EMD   0.0000  0 1205366400 58135.8932 753.5315 
      ES   0.0000  0 1205366400 49649.3736 632.5416 
      ESTX50 -570.0000  1 1205798400 43780.0000 381.5206 
      EUR   0.0000  0 1205280000 125382.9657 605.9757 
      GBL  -1020.0000  -1 1204588800 114130.0000 355.3088 
      GBM  -730.0000  -1 1204588800 108670.0000 229.3634 
      GBP  -93.6138  1 1205280000 84095.0095 477.9144 
      GBS  -825.0000  -3 1204588800 103630.0000 100.1981 
      GBX   0.0000  0 1204588800 91280.0000 548.0709 
      GC   0.0000  0 1201219200 58551.1983 678.5194 
      GE  578.7037  2 1221523200 164760.3486 110.1067 
      GF  204.2484  -1 1203984000 36254.0850 261.8514 

I am trying to access a value in a row by Date and Symbol, and then by column name. So far I have only gotten this far:

In [38]: df.ix['2008-01-02', 'AC'] 
Out[38]: 
     Type  
Value benchmark    NaN 
     cm   1.201824e+09 
     cm_next  1.204330e+09 
     margin    NaN 
     nav     NaN 
     p&l   1.757217e+03 
     position  1.000000e+00 
     rolldate  1.201133e+09 
     value  4.558892e+04 
     vola   4.806781e+02 
Name: (2008-01-02, AC), dtype: float64 

This is close to what I want; however, I can't seem to figure out how to access the Type rows.

df[df.loc['2008-01-02', 'AC']]['p&l'] # raises a KeyError 

In [39]: df.ix['2008-01-02', 'AC']['Value'] # Gets me closer, but not quite there 
Out[39]: 
Type 
benchmark    NaN 
cm   1.201824e+09 
cm_next  1.204330e+09 
margin    NaN 
nav     NaN 
p&l   1.757217e+03 
position  1.000000e+00 
rolldate  1.201133e+09 
value  4.558892e+04 
vola   4.806781e+02 
Name: (2008-01-02, AC), dtype: float64 

In [40]: df.ix['2008-01-02', 'AC']['Value']['p&l'] # raises another KeyError 

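I can't rely on DataFrame.head().tail() or any other kind of numeric indexing, because I can't be sure that the columns will always be in the same order, or that there will be the same number of them on every run.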

Answer


I think you can use loc; see the docs, section "Using slicers".

print(df)
         p&l position rolldate  value  vola 
Date  Symbol               
2008-01-02 AC  1757.2168   1 1201132800 45588.9161 480.6781 
      AUD  0.0000   0 1205280000 59872.0044 542.8067 
      BAX  551.1540   2 1208736000 165621.7706 125.8527 
      BTP  0.0000   0   -1   NaN 0.0000 
      C  674.4908   2 1202342400 14407.1226 137.4325 
      CAC40  0.0000   0 1200441600 55565.0000 580.2757 
      CAD  0.0000   0 1205280000 68784.0414 593.7115 
      CC  422.1133   1 1202428800 14276.9608 197.4064 
      CGB  482.2597   1 1203638400 79655.5288 299.6622 

idx = pd.IndexSlice 
print(df.loc[idx['2008-01-02', 'AC'], idx['rolldate']])
1201132800.0 
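
For readers who want to try this on current pandas (where .ix no longer exists; it was deprecated in 0.20 and removed in 1.0), here is a minimal self-contained sketch of the same label-based lookup. The frame df_flat and its three rows are a hypothetical reduced sample copied from the question's table, not the asker's real data, and a plain tuple is used for the row key instead of pd.IndexSlice.

```python
import pandas as pd

# Hypothetical reduced sample: three rows taken from the question's table.
index = pd.MultiIndex.from_tuples(
    [("2008-01-02", "AC"), ("2008-01-02", "AUD"), ("2008-01-02", "BAX")],
    names=["Date", "Symbol"],
)
df_flat = pd.DataFrame(
    {
        "p&l": [1757.2168, 0.0, 551.1540],
        "position": [1, 0, 2],
        "rolldate": [1201132800, 1205280000, 1208736000],
    },
    index=index,
)

# The tuple selects one row of the (Date, Symbol) index; the second
# argument selects the column by name, so column order does not matter.
rolldate_ac = df_flat.loc[("2008-01-02", "AC"), "rolldate"]
print(rolldate_ac)
```

Because both keys are labels, the lookup keeps working if columns are added, removed, or reordered between runs.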

Edit:

If the columns are a MultiIndex too, add the first level, Value:

print(df)
Type     Value            
Date     p&l position rolldate  value  vola 
2008-01-02 AC  1757.2168  1 1201132800 45588.9161 480.6781 
      AUD  0.0000  0 1205280000 59872.0044 542.8067 
      BAX  551.1540  2 1208736000 165621.7706 125.8527 
      BTP  0.0000  0   -1   NaN 0.0000 
      C  674.4908  2 1202342400 14407.1226 137.4325 
      CAC40  0.0000  0 1200441600 55565.0000 580.2757 
      CAD  0.0000  0 1205280000 68784.0414 593.7115 
      CC  422.1133  1 1202428800 14276.9608 197.4064 
      CGB  482.2597  1 1203638400 79655.5288 299.6622 


idx = pd.IndexSlice 
print(df.loc[idx['2008-01-02', 'AC'], idx['Value', 'rolldate']])
1201132800.0 
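
The two-level column case can be sketched the same way. This is only an illustration under assumptions: df_nested is a hypothetical two-row frame whose columns mimic the (Value, Type) layout described in this thread, and dropping the outer column level is one possible alternative, not something the original answer proposes.

```python
import pandas as pd

# Hypothetical sample with a MultiIndex on both axes.
index = pd.MultiIndex.from_tuples(
    [("2008-01-02", "AC"), ("2008-01-02", "AUD")],
    names=["Date", "Symbol"],
)
columns = pd.MultiIndex.from_product(
    [["Value"], ["p&l", "position", "rolldate"]],
    names=[None, "Type"],
)
df_nested = pd.DataFrame(
    [[1757.2168, 1, 1201132800], [0.0, 0, 1205280000]],
    index=index,
    columns=columns,
)

# Both axes are MultiIndexes, so each one takes a full tuple key.
pnl_ac = df_nested.loc[("2008-01-02", "AC"), ("Value", "p&l")]

# Alternative: drop the constant outer column level, then use the
# plain column name as in the single-level case.
flat = df_nested.copy()
flat.columns = flat.columns.droplevel(0)
pnl_ac_flat = flat.loc[("2008-01-02", "AC"), "p&l"]
print(pnl_ac, pnl_ac_flat)
```

Checking df.columns.nlevels first tells you which of the two key shapes a given frame needs.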

'df.loc[idx['2008-01-02', 'AC'], idx['rolldate']]' raises a 'KeyError' for me: 'the label [rolldate] is not in the [index]' – nlsdfnbch


What does 'print df.columns' show? Is it a list? – jezrael


It is a MultiIndex: 'MultiIndex(levels=[['Value'], ['benchmark', 'cm', 'cm_next', 'margin', 'nav', 'p&l', 'position', 'rolldate', 'value', 'vola']], labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]], names=[None, 'Type'])' – nlsdfnbch
