2016-09-19 60 views

Inaccurate decimals in the index returned from a pandas DataFrame

I have a pandas DataFrame like this:

   0   1   2   3   4   5  \ 
    event_at                
    0.00  1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 
    0.01  0.975381 0.959061 0.979856 0.985625 0.986080 0.976601 
    0.02  0.959103 0.932374 0.966486 0.976037 0.976791 0.961114 
    0.03  0.946154 0.911362 0.955820 0.968362 0.969353 0.948785 
    0.04  0.935378 0.894024 0.946924 0.961940 0.963129 0.938518 
    0.05  0.926099 0.879201 0.939248 0.956385 0.957744 0.929672 
    0.06  0.917608 0.865726 0.932212 0.951282 0.952796 0.921574 
    ...... 
    0.96  0.072472 0.012264 0.117352 0.217737 0.228561 0.082670 
    0.97  0.066553 0.010632 0.109468 0.207225 0.217870 0.076244 
    0.98  0.060532 0.009069 0.101313 0.196119 0.206555 0.069677 
    0.99  0.054657 0.007642 0.093212 0.184828 0.195031 0.063237 
    1.00  0.019128 0.001314 0.039558 0.100442 0.108064 0.023328 

I want to get all of the index values:

>>> df.index 
[0.0, 0.01, 0.02, 0.029999999999999999, 0.040000000000000001, 0.050000000000000003, 0.059999999999999998, 
... 
0.95999999999999996, 0.96999999999999997, 0.97999999999999998, 0.98999999999999999, 1.0] 


# What I expect is like: 

    [0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 
     ... 
     0.96, 0.97, 0.98, 0.99, 1.0] 

This floating-point issue gives me the following exception:

>>> df.loc[0.35].values 
Traceback (most recent call last): 
    File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1395, in _has_valid_type 
    error() 
    File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1390, in error 
    (key, self.obj._get_axis_name(axis))) 
KeyError: 'the label [0.35] is not in the [index]' 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
    File "J:\Workspace\dataset_loader.py", line 171, in <module> 
    print(y_pred_cox_alldep.loc[0.35].values) 
    File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1296, in __getitem__ 
    return self._getitem_axis(key, axis=0) 
    File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1466, in _getitem_axis 
    self._has_valid_type(key, axis) 
    File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1403, in _has_valid_type 
    error() 
    File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1390, in error 
    (key, self.obj._get_axis_name(axis))) 
KeyError: 'the label [0.35] is not in the [index]' 

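In general, indexing on floats, or testing floats for equality, has this problem. Integers are easy to test against each other, but floats can only be tested for being "close". You might also want to look at string indexes. – hpaulj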
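To illustrate the comment above, here is a minimal sketch of why exact float comparison fails while a tolerance-based comparison succeeds. The integer key at the end (hundredths as `int`) is only one hypothetical way to sidestep float labels; it is not something the original poster used.

```python
import numpy as np

# Exact equality on floats is fragile: 0.1 + 0.2 is not stored as exactly 0.3.
exact = (0.1 + 0.2 == 0.3)

# A tolerance-based comparison treats the two values as equal.
close = np.isclose(0.1 + 0.2, 0.3)

# Hypothetical alternative: use integer hundredths as labels instead of floats,
# so that lookups compare integers exactly.
key = int(round(0.35 * 100))

print(exact, close, key)
```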

Answers

You can do it this way (assuming we want to get the row with index 0.96, which is internally represented as 0.95999999999):

In [466]: df.index 
Out[466]: Float64Index([0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.95999999999, 0.97, 0.98, 0.99, 1.0], dtype='float64') 

In [467]: df.ix[df.index[np.abs(df.index - 0.96) < 1e-6]] 
Out[467]: 
      0   1   2   3   4  5 
0.96 0.072472 0.012264 0.117352 0.217737 0.228561 0.08267 
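The same tolerance-based lookup can be sketched without `.ix`, using a boolean mask with `.loc`. This is only an illustration: the small frame and the column name `a` are made up, and `np.isclose` with `atol=1e-6` stands in for the `np.abs(...) < 1e-6` test above.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose index holds 0.95999999999 instead of 0.96.
idx = [0.0, 0.01, 0.02, 0.95999999999, 0.97]
df = pd.DataFrame({"a": [1.0, 0.975381, 0.959103, 0.072472, 0.066553]}, index=idx)

# Boolean mask: True where the label is within tolerance of 0.96.
mask = np.isclose(df.index.to_numpy(), 0.96, atol=1e-6)

# .loc accepts a boolean mask and returns the matching rows as a DataFrame.
row = df.loc[mask]
print(row)
```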

Or, if you are able to change (round) the index:

In [430]: df.index = [0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.95999999999, 0.97, 0.98, 0.99, 1.0] 

In [431]: df 
Out[431]: 
      0   1   2   3   4   5 
0.00 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 
0.01 0.975381 0.959061 0.979856 0.985625 0.986080 0.976601 
0.02 0.959103 0.932374 0.966486 0.976037 0.976791 0.961114 
0.03 0.946154 0.911362 0.955820 0.968362 0.969353 0.948785 
0.04 0.935378 0.894024 0.946924 0.961940 0.963129 0.938518 
0.05 0.926099 0.879201 0.939248 0.956385 0.957744 0.929672 
0.06 0.917608 0.865726 0.932212 0.951282 0.952796 0.921574 
0.96 0.072472 0.012264 0.117352 0.217737 0.228561 0.082670 
0.97 0.066553 0.010632 0.109468 0.207225 0.217870 0.076244 
0.98 0.060532 0.009069 0.101313 0.196119 0.206555 0.069677 
0.99 0.054657 0.007642 0.093212 0.184828 0.195031 0.063237 
1.00 0.019128 0.001314 0.039558 0.100442 0.108064 0.023328 

In [432]: df.index 
Out[432]: Float64Index([0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.95999999999, 0.97, 0.98, 0.99, 1.0], dtype='float64') 

In [433]: df.ix[.96] 
... skipped ... 
KeyError: 0.96 

Now let's round the index:

In [434]: df.index = df.index.values.round(2) 

In [435]: df.index 
Out[435]: Float64Index([0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.96, 0.97, 0.98, 0.99, 1.0], dtype='float64') 

In [436]: df.ix[.96] 
Out[436]: 
0 0.072472 
1 0.012264 
2 0.117352 
3 0.217737 
4 0.228561 
5 0.082670 
Name: 0.96, dtype: float64 

UPDATE: starting from pandas 0.20.1, the .ix indexer is deprecated in favor of the stricter .iloc and .loc indexers.
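For reference, a minimal sketch of the rounding approach written with `.loc` instead of `.ix`. The frame and the column name `a` are hypothetical; the rounding call is the same `df.index.values.round(2)` used above.

```python
import pandas as pd

# Hypothetical frame with an imprecise label (0.95999999999 instead of 0.96).
idx = [0.0, 0.01, 0.02, 0.95999999999, 0.97]
df = pd.DataFrame({"a": [1.0, 0.975381, 0.959103, 0.072472, 0.066553]}, index=idx)

# Round the labels to two decimals so they match literals such as 0.96.
df.index = df.index.values.round(2)

# Label-based lookup with .loc now finds the row; the result is a Series.
row = df.loc[0.96]
print(row)
```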