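Finding the three closest x, y points across three arrays

In Python, I have three lists of x and y coordinates. Each list contains 128 points. How can I efficiently find the three points, one from each list, that are closest to each other?

Here is my working Python code, but it is not efficient enough: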

import math

def findclosest(c1, c2, c3):
    mina = 999999999
    for i in c1:
        for j in c2:
            for k in c3:
                # calculate sum of distances between points
                d = xy3dist(i, j, k)
                if d < mina:
                    mina = d
    return mina

def xy3dist(a, b, c):
    l1 = math.sqrt((a[0]-b[0]) ** 2 + (a[1]-b[1]) ** 2)
    l2 = math.sqrt((b[0]-c[0]) ** 2 + (b[1]-c[1]) ** 2)
    l3 = math.sqrt((a[0]-c[0]) ** 2 + (a[1]-c[1]) ** 2)
    return l1 + l2 + l3

Any ideas on how this could be done using numpy?

Source

2014-09-19 AlterSchwede

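What is the distance function xy3dist? – 2014-09-19 15:34:00

Sorry - updated. It could be simplified by removing the sqrt, but that would not improve the speed. I need another solution :-) – AlterSchwede 2014-09-19 15:37:32

Just to clarify: by "closest three points" you still mean one point from each list? – Ghanima 2014-09-19 15:39:16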

You can use NumPy's broadcasting to vectorize the two inner loops:


import numpy as np 

def findclosest(c1, c2, c3): 
    c1 = np.asarray(c1) 
    c2 = np.asarray(c2) 
    c3 = np.asarray(c3) 

    for arr in (c1, c2, c3):
        if not (arr.ndim == 2 and arr.shape[1] == 2):
            raise ValueError("expected arrays of 2D coordinates")

    min_val = np.inf
    min_pos = None

    for a, i in enumerate(c1):
        # distances from point i to every (c2, c3) pair: shape (len(c2), len(c3))
        d = xy3dist(i, c2.T[:,:,np.newaxis], c3.T[:,np.newaxis,:])
        k = np.argmin(d)

        if d.flat[k] < min_val:
            min_val = d.flat[k]
            b, c = np.unravel_index(k, d.shape)
            min_pos = (a, b, c)

        print(a, min_val, d.min())

    return min_val, min_pos

def xy3dist(a, b, c):
    l1 = np.sqrt((a[0]-b[0]) ** 2 + (a[1]-b[1]) ** 2)
    l2 = np.sqrt((b[0]-c[0]) ** 2 + (b[1]-c[1]) ** 2)
    l3 = np.sqrt((a[0]-c[0]) ** 2 + (a[1]-c[1]) ** 2)
    return l1+l2+l3

np.random.seed(1234)
c1 = np.random.rand(5, 2)
c2 = np.random.rand(9, 2)
c3 = np.random.rand(7, 2)

val, pos = findclosest(c1, c2, c3)

a, b, c = pos
print(val, xy3dist(c1[a], c2[b], c3[c]))

It is also possible to vectorize all three loops:

 
def findclosest2(c1, c2, c3): 
    c1 = np.asarray(c1) 
    c2 = np.asarray(c2) 
    c3 = np.asarray(c3) 
    d = xy3dist(c1.T[:,:,np.newaxis,np.newaxis], c2.T[:,np.newaxis,:,np.newaxis], c3.T[:,np.newaxis,np.newaxis,:]) 
    k = np.argmin(d) 
    min_val = d.flat[k] 
    a, b, c = np.unravel_index(k, d.shape) 
    min_pos = (a, b, c) 
    return min_val, min_pos

If your arrays are very big, findclosest may be preferable to findclosest2 because it uses less memory. (And if your arrays are huge, vectorize only the innermost loop.)
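To put rough numbers on that memory difference, here is a small sketch that counts only the bytes held by the final distance array, assuming float64 values. The temporaries created inside xy3dist come on top of this, and the helper name `result_bytes` is made up for illustration.

```python
import numpy as np

def result_bytes(*lengths, dtype=np.float64):
    """Bytes needed for one array of the given shape (hypothetical helper)."""
    return int(np.prod(lengths, dtype=np.int64)) * np.dtype(dtype).itemsize

n = 128
full = result_bytes(n, n, n)   # findclosest2: a single n x n x n array
per_step = result_bytes(n, n)  # findclosest: one n x n array per outer iteration

print(full / 2**20, "MiB for the fully vectorized version")
print(per_step / 2**10, "KiB per iteration for the partially vectorized version")
```

For 128 points per list the full array is small, but it grows with the product of the three list lengths, which is why the partially vectorized version scales better.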

You can google "numpy broadcasting" to learn more about what np.newaxis does.
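As a quick illustration of what the np.newaxis indexing does, the sketch below reproduces only the c2/c3 part of the distance computation and prints the intermediate shapes. The array sizes are arbitrary sample values.

```python
import numpy as np

c2 = np.random.rand(9, 2)   # 9 points, columns are x and y
c3 = np.random.rand(7, 2)   # 7 points

b = c2.T[:, :, np.newaxis]  # shape (2, 9, 1): b[0] holds x, b[1] holds y
c = c3.T[:, np.newaxis, :]  # shape (2, 1, 7)

# Combining a (9, 1) array with a (1, 7) array broadcasts to (9, 7),
# so dist[j, k] is the distance between c2[j] and c3[k].
dx = b[0] - c[0]
dy = b[1] - c[1]
dist = np.sqrt(dx ** 2 + dy ** 2)

print(b.shape, c.shape, dist.shape)
```

The same trick with one more axis gives the three-dimensional array used in findclosest2.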

Source

2014-09-19 15:59:16

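Probably trivial for numpy users, but I need the x, y coordinates of the three points... OK - it is trivial - solved :-) – AlterSchwede 2014-09-19 16:34:59

@AlterSchwede That is why his second solution returns 'min_pos', which is just the index of the minimizing point in each array. You can extract the points with a simple 'c1[a], c2[b], c3[c]'. – 2014-09-19 16:41:10

This solution is 100 times faster than the original - thank you very much! – AlterSchwede 2014-09-19 16:52:22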
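For anyone else who needs the coordinates rather than the indices, here is a minimal sketch of the step described in the comments. The toy data is made up, and a brute-force search stands in for the `min_pos` that findclosest or findclosest2 would return.

```python
import numpy as np
from itertools import product

c1 = np.array([[0.0, 0.0], [5.0, 5.0]])
c2 = np.array([[9.0, 9.0], [5.0, 6.0]])
c3 = np.array([[6.0, 5.0], [-3.0, 2.0]])

def perimeter(i, j, k):
    """Sum of the three pairwise distances for one index triple."""
    p, q, r = c1[i], c2[j], c3[k]
    return np.linalg.norm(p - q) + np.linalg.norm(q - r) + np.linalg.norm(p - r)

# Stand-in for the min_pos returned by findclosest / findclosest2.
min_pos = min(product(range(len(c1)), range(len(c2)), range(len(c3))),
              key=lambda ijk: perimeter(*ijk))

a, b, c = min_pos
closest = np.array([c1[a], c2[b], c3[c]])  # shape (3, 2): one row per point
print(min_pos, closest.tolist())
```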

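Let's time a few different solutions and see.

I am going to initialize the three arrays with numpy's random functions. If your existing variables are lists of tuples or lists of lists, just call np.array on them.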
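In case the conversion is unclear, this is roughly what it looks like. The sample points are made up.

```python
import numpy as np

points_as_tuples = [(0.5, 1.0), (2.0, 3.5), (4.0, 0.25)]  # made-up sample data
arr = np.array(points_as_tuples)

print(arr.shape)   # one row per point, two columns (x, y)
print(arr[:, 0])   # all x coordinates
```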

import numpy as np 

c1 = np.random.normal(size=(128, 2)) 
c2 = np.random.normal(size=(128, 2)) 
c3 = np.random.normal(size=(128, 2))

First, let's time your code so we have a starting point. This may be useful.

import math

def findclosest(c1, c2, c3):
    mina = 999999999
    for i in c1:
        for j in c2:
            for k in c3:
                # calculate sum of distances between points
                d = xy3dist(i, j, k)
                if d < mina:
                    mina = d
    return mina

def xy3dist(a, b, c):
    l1 = math.sqrt((a[0]-b[0]) ** 2 + (a[1]-b[1]) ** 2)
    l2 = math.sqrt((b[0]-c[0]) ** 2 + (b[1]-c[1]) ** 2)
    l3 = math.sqrt((a[0]-c[0]) ** 2 + (a[1]-c[1]) ** 2)
    return l1+l2+l3

%timeit findclosest(c1, c2, c3) 
# 1 loops, best of 3: 23.3 s per loop
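Note that %timeit is an IPython magic. Outside IPython, something along these lines with the standard timeit module should give comparable numbers. The `work` function here is only a placeholder workload, not one of the solutions.

```python
import timeit
import numpy as np

def work():
    # Placeholder workload; swap in e.g. findclosest(c1, c2, c3).
    return np.random.normal(size=(128, 2)).sum()

# repeat=3, number=1 mirrors the "1 loops, best of 3" report.
times = timeit.repeat(work, repeat=3, number=1)
print("best of 3: %.6f s per loop" % min(times))
```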

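One useful function is scipy.spatial.distance.cdist, which computes all pairwise distances between two arrays of points. We can use it to precompute and store all the distances, then simply look up and add the distances from those arrays. I will also use itertools.product to simplify the looping, though it won't provide any speedup.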

from scipy.spatial.distance import cdist 
from itertools import product 

def findclosest_usingcdist(c1, c2, c3): 
    dists_12 = cdist(c1, c2) 
    dists_23 = cdist(c2, c3) 
    dists_13 = cdist(c1, c3) 

    min_dist = np.inf 
    ind_gen = product(range(len(c1)), range(len(c2)), range(len(c3))) 
    for i1, i2, i3 in ind_gen:
        dist = dists_12[i1, i2] + dists_23[i2, i3] + dists_13[i1, i3]
        if dist < min_dist:
            min_dist = dist
            min_points = (c1[i1], c2[i2], c3[i3])

    return min_dist, min_points 

%timeit findclosest_usingcdist(c1, c2, c3) 
# 1 loops, best of 3: 2.02 s per loop

So using cdist buys us an order of magnitude speedup.
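The remaining Python loop over index triples can itself be replaced by broadcasting the three distance tables into one 3-D array. This is only a sketch combining the two ideas in this thread, not something either answer posted or timed. The tables are built with plain NumPy through a made-up helper `pairwise` so that it runs without SciPy; cdist should give the same matrices up to rounding.

```python
import numpy as np

def pairwise(p, q):
    """Euclidean distance table: result[i, j] = |p[i] - q[j]| (hypothetical helper)."""
    diff = p[:, np.newaxis, :] - q[np.newaxis, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def findclosest_tables(c1, c2, c3):
    d12 = pairwise(c1, c2)  # shape (n1, n2)
    d23 = pairwise(c2, c3)  # shape (n2, n3)
    d13 = pairwise(c1, c3)  # shape (n1, n3)
    # Broadcast each table along its missing axis to get total[i, j, k].
    total = d12[:, :, np.newaxis] + d23[np.newaxis, :, :] + d13[:, np.newaxis, :]
    i1, i2, i3 = np.unravel_index(np.argmin(total), total.shape)
    return total[i1, i2, i3], (c1[i1], c2[i2], c3[i3])

rng = np.random.RandomState(0)
c1, c2, c3 = rng.normal(size=(3, 16, 2))  # three sets of 16 random points
min_dist, min_points = findclosest_tables(c1, c2, c3)
print(min_dist, min_points)
```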

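However, this doesn't even compare to @pv's answer. Here is his implementation with some things stripped out so that it compares better with the previous solutions (see @pv's answer for the original implementation, which returns the indices of the points).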

def findclosest2(c1, c2, c3): 
    d = xy3dist(c1.T[:,:,np.newaxis,np.newaxis], 
       c2.T[:,np.newaxis,:,np.newaxis], 
       c3.T[:,np.newaxis,np.newaxis,:]) 
    k = np.argmin(d) 
    min_val = d.flat[k] 
    i1, i2, i3 = np.unravel_index(k, d.shape) 
    min_points = (c1[i1], c2[i2], c3[i3]) 
    return min_val, min_points 

def xy3dist(a, b, c): 
    l1 = np.sqrt((a[0]-b[0]) ** 2 + (a[1]-b[1]) ** 2) 
    l2 = np.sqrt((b[0]-c[0]) ** 2 + (b[1]-c[1]) ** 2) 
    l3 = np.sqrt((a[0]-c[0]) ** 2 + (a[1]-c[1]) ** 2)  
    return l1+l2+l3 

%timeit findclosest2(c1, c2, c3)
# 100 loops, best of 3: 19.1 ms per loop

So this is a huge speedup, and it is definitely the right answer.

Source

2014-09-19 16:01:05

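Looks good, but I need the xy coordinates of the three closest points... – AlterSchwede 2014-09-19 16:33:20

@AlterSchwede Sigh, I wish you had clarified that when I asked exactly this in the comments. Anyway, I modified the latter two solutions to return the points. pv's answer already got you 90% of the way there. – 2014-09-19 16:39:59

Sorry - solved now, and thanks for solving my performance problem. – AlterSchwede 2014-09-19 16:59:09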
