Finding matching pairs between numpy arrays in Python

I have a numpy array in Python that contains the labels for a classification problem. The array is the result of concatenating two initially identical arrays.

labels = np.concatenate((labels1, labels2)) #labels1 and labels2 are identical

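I want to generate the indices of all positive pairs (a label from labels1 that equals a label from labels2), and also the indices of the negative pairs. For example, given the following input: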

labels = {1, 1, 2, 2, 3, 1, 1, 2, 2, 3} # labels1 = labels2 = {1, 1, 2, 2, 3}

then I want the positive pairs returned as:

positive_pairs = {{1, 6}, {1, 7}, {2, 6}, {2, 7}, {3, 8}, {3, 9}, {4, 8}, {4, 9}, {5, 10}} # I don't want {1, 2} or {3, 4} among the positives
negative_pairs = {{1, 8}, {1, 9}, ...}

How can I do this in Python?

Edit: What about the case where labels1 and labels2 are not equal?

Source

2017-07-09 Jose Ramon

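Hi, do you want your output to be a 2D numpy array or some other type?

It doesn't matter. Ideally I would like 4 matrices a, b, c, d, where b is the positive pairs and c, d are the negative ones.

In the context of "pairs", it is unclear what "positive" and "negative" mean. Also, what do the elements of these pairs represent? For example, what do 1 and 6 mean in the '{1, 6}' of 'positive_pairs'?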

Here is a solution for positive_pairs:

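import numpy as np

labels1 = np.array([1, 1, 2, 2, 3])
length1 = len(labels1)
positive_pairs = []
# enumerate(..., 1) yields 1-based positions in the first copy
for ii, label in enumerate(labels1, 1):
    # matching 0-based positions, shifted to 1-based positions in the second copy
    for other in np.where(labels1 == label)[0] + length1 + 1:
        positive_pairs.append((ii, other))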

negative_pairs is left as an exercise.
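One way the exercise might be approached is to reuse the same loop and flip the comparison. This is only a sketch: it assumes, as in the question's example, that both halves of the concatenated array are the same `labels1`, and that a "negative pair" means any cross-half pair whose labels differ.

```python
import numpy as np

labels1 = np.array([1, 1, 2, 2, 3])
length1 = len(labels1)

negative_pairs = []
for ii, label in enumerate(labels1, 1):
    # 0-based positions in the second copy whose label differs,
    # shifted to 1-based positions in the concatenated array
    for other in np.where(labels1 != label)[0] + length1 + 1:
        negative_pairs.append((ii, int(other)))

print(negative_pairs[:3])
```

Every cross-half pair is either positive or negative, so for this input the two lists together should account for all 5 × 5 combinations.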

Source

2017-07-09 14:35:43

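What if the two arrays labels1 and labels2 are not equal?

@JoseRamon: I don't know, that is not what the question asked. I think you can extrapolate to the case you are asking about now.

I am sure I can follow the way enumerate works. Also, in my case I need to change labels1, because I am not repeating the same array twice.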

It can be done like this:

labels_1 = np.array([1,1,2,2,3]) 
labels_2 = np.array([1,1,2,2,3]) 
n = len(labels_1) 
positive_pairs = [(i1+1, i2+n+1) for i1, l in enumerate(labels_1) 
           for i2 in np.where(labels_2 == l)[0]]

[(1, 6), (1, 7), (2, 6), (2, 7), ...]

negative_pairs = [(i1+1, i2+n+1) for i1, l in enumerate(labels_1) 
           for i2 in np.where(labels_2 != l)[0]]

[(1, 8), (1, 9), (1, 10), (2, 8), ...]

Although I am not sure this is the most efficient way.

Source

2017-07-09 14:37:45 tarashypka

What if the two arrays labels1 and labels2 are not equal?

@JoseRamon It should work whether labels_1 and labels_2 are equal or not. – tarashypka
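To illustrate that claim, here is a small sketch that wraps the two comprehensions in a helper and runs them on arrays of different lengths. The function name `make_pairs` and the sample inputs are made up for this example and are not part of the answer.

```python
import numpy as np

def make_pairs(labels_1, labels_2):
    """Return (positive, negative) lists of 1-based index pairs.

    The second index is offset by len(labels_1), i.e. it is a position
    in np.concatenate((labels_1, labels_2)).
    """
    labels_1 = np.asarray(labels_1)
    labels_2 = np.asarray(labels_2)
    n = len(labels_1)
    positive = [(i1 + 1, int(i2) + n + 1)
                for i1, l in enumerate(labels_1)
                for i2 in np.where(labels_2 == l)[0]]
    negative = [(i1 + 1, int(i2) + n + 1)
                for i1, l in enumerate(labels_1)
                for i2 in np.where(labels_2 != l)[0]]
    return positive, negative

# hypothetical inputs of different lengths
pos, neg = make_pairs([1, 2, 2], [2, 3, 1, 2])
print(pos)
print(neg)
```

Because the offset is taken from `len(labels_1)` alone, nothing in the comprehensions requires the two arrays to have the same length or contents.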

import numpy as np

outp = []
len1 = len(labels) // 2  # assumes labels was built as [label1, label1]
label1 = labels[:len1]
label2 = labels[len1:]
set1 = set(label1)
for v in set1:
    # 1-based positions of v in each half of the concatenated array
    eq1 = np.where(label1 == v)[0] + 1
    eq2 = np.where(label2 == v)[0] + len1 + 1
    # Cartesian product of the two position lists
    outp.append(np.transpose([np.tile(eq1, len(eq2)), np.repeat(eq2, len(eq1))]))
outp = np.concatenate(outp).tolist()

# Edit: Find "negative pairs"
# 1-based positions in label2 whose value does not occur anywhere in label1
eq3 = np.flatnonzero(np.isin(label2, list(set1), invert=True)) + len1 + 1
# pair every 1-based position of label1 with each of those positions
outn = np.transpose([np.tile(np.arange(1, len1 + 1), len(eq3)), np.repeat(eq3, len1)]).tolist()

Source

2017-07-09 17:07:39

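How do I return the negative pairs in this case?

I tried changing label1 == v to !=, but it does not seem to be that simple.

@JoseRamon 'v' is drawn from 'set1', which is the set of all distinct values in 'label1', so the condition 'label1 != v' will never hold. That is, if you change 'label1 == v' to 'label1 != v', 'eq1' will always be an empty array (meaning a zero-length array). What you need to do is find all elements of 'label2' that differ from *every* element of 'set1'. I will edit my answer to show how this can be done.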
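The last step described in that comment, finding the elements of `label2` that match no value in `label1`, could be sketched with `np.isin(..., invert=True)`. The `label2` values below are hypothetical and were chosen so that some of them are absent from `label1`. With two identical halves the result would be empty.

```python
import numpy as np

label1 = np.array([1, 1, 2, 2, 3])
label2 = np.array([1, 4, 2, 5, 3])  # hypothetical: 4 and 5 never occur in label1
len1 = len(label1)

# True where an element of label2 equals none of the values in label1
unmatched = np.isin(label2, label1, invert=True)

# 1-based positions of those elements in the concatenated array
eq3 = np.flatnonzero(unmatched) + len1 + 1
print(eq3.tolist())
```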

My solution uses numpy broadcasting and np.where().

# rows index label1, columns index label2, so x refers to label1 even when the arrays differ
x, y = np.where(label1[:, np.newaxis] == label2[np.newaxis, :])
result = np.vstack([x, y + len(label1)]).T + 1
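The same broadcast comparison can also yield the negative pairs by inverting the boolean mask. The following is a sketch under the question's 1-based convention. It uses `np.argwhere` in place of `np.where` plus `np.vstack`, which is a substitution made for brevity and is not the answer's original code.

```python
import numpy as np

label1 = np.array([1, 1, 2, 2, 3])
label2 = np.array([1, 1, 2, 2, 3])

# mask[i, j] is True where label1[i] == label2[j]
mask = label1[:, np.newaxis] == label2[np.newaxis, :]

# shift 0-based (i, j) to 1-based positions in the concatenated array
offset = np.array([1, len(label1) + 1])
positive_pairs = np.argwhere(mask) + offset
negative_pairs = np.argwhere(~mask) + offset

print(positive_pairs[:2].tolist())
print(negative_pairs[:1].tolist())
```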

I think this is an efficient, numpy-style way to solve your problem. However, tarashypka's solution is faster on larger datasets. (Does anyone know why?)

def where_method(a, b):
    # rows index a, columns index b
    x, y = np.where(a[:, np.newaxis] == b[np.newaxis, :])
    return np.vstack([x, y + len(a)]).T + 1  # use len(a), not the global labels1

def for_append_method(a, b):
    length1 = len(a)
    positive_pairs = []
    for ii, label in enumerate(a, 1):
        for other in np.where(b == label)[0] + length1 + 1:
            positive_pairs.append((ii, other))
    return positive_pairs

labels1 = np.sort(np.random.randint(low=10, high=7500, size=10000)) 
labels2 = labels1 

%timeit where_method(labels1, labels2) 
%timeit for_append_method(labels1, labels2)

 
326 ms ± 2.98 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) 
122 ms ± 1.71 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
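One possible contributor to the gap, offered as a guess and not as something measured here, is that the broadcast comparison materialises a full len(a) × len(b) boolean array before `np.where` scans it. The sketch below shows how that intermediate grows. The size `n = 1000` is an arbitrary choice to keep the demo light, and the 10000-element figure is only projected, not allocated.

```python
import numpy as np

n = 1000  # smaller than the 10000 used in the benchmark
a = np.sort(np.random.randint(low=10, high=7500, size=n))

# the intermediate array that the broadcasting approach builds
mask = a[np.newaxis, :] == a[:, np.newaxis]
mask_bytes = mask.nbytes  # one byte per boolean element

# projected size of the same intermediate for the benchmark's 10000 labels
projected_bytes = 10000 * 10000 * mask.itemsize

print(mask.shape, mask_bytes, projected_bytes)
```

The loop version performs the same number of element comparisons in total, but it only holds one length-n temporary at a time.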

Source

2018-01-30 00:33:17 meta4
