2014-03-05 90 views

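Get the row and column index pairs of a pandas DataFrame that match some condition

Suppose I have a pandas DataFrame like the one below. The values are based on a distance matrix.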

import numpy as np
import pandas as pd

A = pd.DataFrame([
    (1.0, 0.8, 0.6708203932499369, 0.6761234037828132, 0.7302967433402214),
    (0.8, 1.0, 0.6708203932499369, 0.8451542547285166, 0.9128709291752769),
    (0.6708203932499369, 0.6708203932499369, 1.0, 0.5669467095138409, 0.6123724356957946),
    (0.6761234037828132, 0.8451542547285166, 0.5669467095138409, 1.0, 0.9258200997725514),
    (0.7302967433402214, 0.9128709291752769, 0.6123724356957946, 0.9258200997725514, 1.0),
])

Output:

Out[65]:
          0         1         2         3         4
0  1.000000  0.800000  0.670820  0.676123  0.730297
1  0.800000  1.000000  0.670820  0.845154  0.912871
2  0.670820  0.670820  1.000000  0.566947  0.612372
3  0.676123  0.845154  0.566947  1.000000  0.925820
4  0.730297  0.912871  0.612372  0.925820  1.000000

I only want the upper triangle.

c2 = A.copy()
# Set the lower triangle and the diagonal to NaN, leaving only the strict upper triangle
c2.values[np.tril_indices_from(c2)] = np.nan

Output:

Out[67]:
    0    1        2         3         4
0 NaN  0.8  0.67082  0.676123  0.730297
1 NaN  NaN  0.67082  0.845154  0.912871
2 NaN  NaN      NaN  0.566947  0.612372
3 NaN  NaN      NaN       NaN  0.925820
4 NaN  NaN      NaN       NaN       NaN
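As a side note, the same masking can be done without writing into `.values`, which may be safer on pandas versions where that array is a read-only view. The sketch below is only an illustration: the 3x3 matrix is made-up data, not the matrix from the question, and `upper` is a name introduced here.

```python
import numpy as np
import pandas as pd

# Small symmetric example matrix (illustrative values, not the question's data)
A = pd.DataFrame([[1.0, 0.8, 0.3],
                  [0.8, 1.0, 0.9],
                  [0.3, 0.9, 1.0]])

# Boolean mask that is True strictly above the diagonal (k=1 excludes the diagonal)
upper = np.triu(np.ones(A.shape, dtype=bool), k=1)

# where() keeps values where the mask is True and puts NaN elsewhere;
# A itself is left unchanged
c2 = A.where(upper)
print(c2)
```

The result has the same shape and labels as `A`, with NaN on and below the diagonal.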

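Now I want to get the row and column index pairs that satisfy some criterion. For example: get the row and column indices where the value is greater than 0.8. For this, the output should be [1,3], [1,4], [3,4]. Any help with this?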

Answer


You can use NumPy's argwhere:

In [11]: np.argwhere(c2 > 0.8)
Out[11]:
array([[1, 3],
       [1, 4],
       [3, 4]])
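For readers who want to skip the NaN-masked copy, one possible variant is to compare on the underlying NumPy array and restrict the boolean result to the strict upper triangle before calling `np.argwhere`. This is a sketch under assumptions: the 3x3 matrix and the 0.5 threshold are made up for illustration, and `hits` and `pairs` are names introduced here.

```python
import numpy as np
import pandas as pd

# Small symmetric example matrix (illustrative values, not the question's data)
A = pd.DataFrame([[1.0, 0.8, 0.3],
                  [0.8, 1.0, 0.9],
                  [0.3, 0.9, 1.0]])

# Compare on the plain NumPy array, then keep only hits strictly above the diagonal
hits = np.triu(A.to_numpy() > 0.5, k=1)

# Each row of the result is one (row position, column position) pair
pairs = np.argwhere(hits)
print(pairs.tolist())  # [[0, 1], [1, 2]]
```

Passing the array from `to_numpy()` explicitly avoids relying on how NumPy converts a DataFrame argument, which may differ between library versions.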

To get the index/column labels (rather than their integer positions), you can use a list comprehension:

[(c2.index[i], c2.columns[j]) for i, j in np.argwhere(c2 > 0.8)] 
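To make the difference between positions and labels concrete, here is a small sketch with non-consecutive labels. The labels `[1, 5, 8]`, the 3x3 matrix and the 0.5 threshold are hypothetical, chosen only for illustration. It also shows `stack()` as one possible alternative that yields label pairs directly.

```python
import numpy as np
import pandas as pd

labels = [1, 5, 8]  # hypothetical non-consecutive row/column labels
A = pd.DataFrame([[1.0, 0.8, 0.3],
                  [0.8, 1.0, 0.9],
                  [0.3, 0.9, 1.0]], index=labels, columns=labels)

# Keep only the strict upper triangle; everything else becomes NaN
c2 = A.where(np.triu(np.ones(A.shape, dtype=bool), k=1))

# Integer positions from argwhere, mapped back to index/column labels
by_position = [(c2.index[i], c2.columns[j])
               for i, j in np.argwhere(c2.to_numpy() > 0.5)]

# Alternative: stack into a Series keyed by (row label, column label), then filter;
# NaN entries never satisfy the comparison, so they drop out either way
s = c2.stack()
by_stack = s[s > 0.5].index.tolist()

print(by_position)
print(by_stack)
```

Both approaches give the label pairs (1, 5) and (5, 8) here, whereas the raw positions would be (0, 1) and (1, 2).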

It seems I asked the question with a bad example. What if my row and column indexes are [1,2,3,5,8]?


Great! :) Thanks a lot. Please edit it into the answer so that I can accept it.
