Pandas: finding matching values between columns

>>> import pandas as pd
>>> d = {'a': [5, 4, 3, 1, 2], 'b': [1, 2, 3, 4, 5]}
>>> df = pd.DataFrame(d)
>>> df
   a  b
0  5  1
1  4  2
2  3  3
3  1  4
4  2  5

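Given two columns a and b that hold the same values with no duplicates, only in a different order, is there a way to compute an index variable indices such that

df['a'] = df['b'][indices]

holds value by value? In this case,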

>>> indices = [4, 3, 2, 0, 1]
>>> df['b'][indices]
4    5
3    4
2    3
0    1
1    2

2015-06-20 ejang

I guess the naive approach would be:

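def getIndices(a, b):
    # For each value in a, record its position in b.
    rVal = []
    for i in a:
        index = b.index(i)  # linear scan; raises ValueError if i is not in b
        rVal.append(index)
    return rVal

a = [5, 4, 3, 1, 2]
b = [1, 2, 3, 4, 5]

result = getIndices(a, b)
print(result)
# prints [4, 3, 2, 0, 1]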

I think this gives you O(n^2) time complexity, since each b.index(i) call is itself a linear scan of b.
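
Because each `b.index(i)` call scans `b` from the start, one way to avoid the repeated scans is to build a value-to-position dictionary once and look each value up in it. This is only a minimal sketch, assuming the values in `b` are unique and hashable; `get_indices` and `position_in_b` are illustrative names, not part of the original answer.

```python
def get_indices(a, b):
    # Map each value in b to its position; assumes values in b are unique.
    position_in_b = {value: pos for pos, value in enumerate(b)}
    # Look up each value of a; raises KeyError if a value is missing from b.
    return [position_in_b[value] for value in a]

a = [5, 4, 3, 1, 2]
b = [1, 2, 3, 4, 5]

result = get_indices(a, b)
print(result)  # [4, 3, 2, 0, 1]
```

Building the dictionary and doing the lookups each take one pass, so the whole thing is roughly linear in the length of the lists.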

2015-06-20 03:57:45 Sait

You can try:

indices = [df['b'][df['b'] == row['a']].index[0] for idx, row in df.iterrows()] 
indices 
>> [4, 3, 2, 0, 1]

2015-06-20 04:00:20

You can use numpy.argsort():

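import numpy as np
a = np.array(["c", "b", "a", "x", "e", "d"])
b = np.array(["d", "a", "b", "c", "x", "e"])
idx_a = np.argsort(a)       # positions that sort a
idx_b = np.argsort(b)       # positions that sort b
rank_a = np.argsort(idx_a)  # rank of each element of a (inverse of idx_a)
# b[idx_b] is b in sorted order; indexing by rank_a puts it in a's order.
print(b[idx_b[rank_a]])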

The result is:

['c' 'b' 'a' 'x' 'e' 'd']
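
The same sorting idea can be tried on the numbers from the question. This is a sketch, assuming both arrays contain the same unique values; `rank_a` is an illustrative name. Note that the rank of each element of `a` is the argsort of the argsort, which coincides with the plain argsort only for some orderings.

```python
import numpy as np

a = np.array([5, 4, 3, 1, 2])
b = np.array([1, 2, 3, 4, 5])

# rank_a[i] is the rank of a[i] within a (argsort of the argsort).
rank_a = np.argsort(np.argsort(a))
# idx_b[k] is the position in b of its k-th smallest value.
idx_b = np.argsort(b)

indices = idx_b[rank_a]
print(indices.tolist())     # [4, 3, 2, 0, 1]
print(b[indices].tolist())  # [5, 4, 3, 1, 2]
```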

2015-06-20 04:02:25 HYRY

This can be done in plain Python (I don't know whether there is a smarter pandas-specific way).

d = {k:v for v,k in enumerate(list(df['a']))} 
indices = [i[0] for i in sorted(enumerate(list(df['b'])), 
           key=lambda x: d.get(x[1]))]

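If some elements of a are not in b, or vice versa, you will have to use a smarter key function that tolerates missing values (and decide how you want to handle that case for this problem).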
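
As one possible pandas-specific route that also tolerates missing values, `Index.get_indexer` returns the position of each requested value and `-1` where there is no match. This is a sketch, assuming the values in column `b` are unique; the value `9` is made up here purely to show the missing-value case.

```python
import pandas as pd

# Hypothetical data: 9 appears in 'a' but not in 'b'.
df = pd.DataFrame({'a': [5, 4, 3, 1, 9], 'b': [1, 2, 3, 4, 5]})

# For each value of 'a', find its position in 'b'; -1 marks "not found".
# get_indexer requires the values of 'b' to be unique.
indices = pd.Index(df['b']).get_indexer(df['a'])
print(indices.tolist())  # [4, 3, 2, 0, -1]

# Boolean mask of the rows of 'a' that have a match in 'b'.
found = indices >= 0
print(found.tolist())    # [True, True, True, True, False]
```

How to treat the `-1` entries (drop them, raise an error, or fill with a default) remains a decision for the caller.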

2015-06-20 04:04:56 abeboparebop
