2014-11-09 22 views

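Pandas: finding points within a maximum distance

I am trying to find pairs of (x, y) points within a maximum distance of each other. I thought the simplest approach would be to generate a DataFrame and go through the points one by one, checking whether any point with coordinates (x, y) lies within a distance r of the given point (x_0, y_0). Then I divide the total number of discovered pairs by 2, since every pair is found twice.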

%pylab inline 
import pandas as pd 

def find_nbrs(low, high, num, max_d): 
    x = random.uniform(low, high, num) 
    y = random.uniform(low, high, num) 
    points = pd.DataFrame({'x':x, 'y':y}) 

    tot_nbrs = 0 

    for i in arange(len(points)):
        x_0 = points.x[i]
        y_0 = points.y[i]

        pt_nbrz = points[((x_0 - points.x)**2 + (y_0 - points.y)**2) < max_d**2]
        tot_nbrs += len(pt_nbrz)
        plot(pt_nbrz.x, pt_nbrz.y, 'r-')

    plot(points.x, points.y, 'b.')
    return tot_nbrs

print(find_nbrs(0, 1, 50, 0.1))

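  1. Even before dividing the total number of discovered pairs, it does not always find the proper pairs (I see points within the stated distance that are not marked).

  2. If I write plot(..., 'or'), it highlights all the points. This means that pt_nbrz = points[((x_0 - points.x)**2 + (y_0 - points.y)**2) < max_d**2] returns at least one (x, y). Why? Shouldn't it return an empty array if the comparison is False?

  3. How can I do all of the above more elegantly in pandas? For example, without having to loop through each element.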

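Correct me if I'm wrong, but you are doing an O(n) search when I think what you want is an O(n^2) search. You are basically checking the distance between x0:y0, x1:y1, x2:y2... when I think what you want to do is check x0:y0, x0:y1, ... x1:y0, x1:y1, x1:y2 .... – Greg 2014-11-09 07:21:30

But if I am wrong about what you want, then this should work well for you: http://stackoverflow.com/questions/1401712/how-can-the-euclidean-distance-be-calculated-with-numpy – Greg 2014-11-09 07:25:55

Thanks for the link. Even with the answers there, I am having some trouble working out how to compute the distance with numpy.linalg.norm. What format should a and b be in for that example? Re: O(n^2), I think that is what I am doing: looping through each DataFrame element and finding all other elements that satisfy the comparison. This should identify every pair twice, so to get the count I just divide the final tally by 2. – 2014-11-09 18:25:45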

Answer


The functionality you are looking for is included in scipy's spatial distance module.

Here is an example of how to use it. The real magic is in squareform(pdist(points)):

from scipy.spatial.distance import pdist, squareform 
import numpy as np 
import matplotlib.pyplot as plt 

points = np.random.uniform(-.5, .5, (1000,2)) 

# Compute the distance between each distinct pair of rows in points with pdist.
# Then, just for ease of working, convert to a typical symmetric distance matrix
# with squareform.
dists = squareform(pdist(points)) 

poi = points[4] # point of interest 
dist_min = .1 
close_points = dists[4] < dist_min 

print("There are {} other points within a distance of {} from the point " 
    "({:.3f}, {:.3f})".format(close_points.sum() - 1, dist_min, *poi)) 

There are 27 other points within a distance of 0.1 from the point (0.194, 0.160)
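
To get back to the original question (how many pairs lie within the maximum distance, starting from a DataFrame), the same pdist call can be used without squareform. The following is only a sketch under a few assumptions: the helper name count_pairs, the fixed random seed and the column names 'x' and 'y' are chosen here for illustration. Because pdist returns each unordered pair exactly once, a point is never compared with itself, which is also why the loop in the question always finds at least one "neighbour" (the point itself, at distance 0).

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist, squareform

def count_pairs(points, max_d):
    """Count unordered pairs of rows in `points` closer than `max_d`."""
    # pdist yields the distance of each unordered pair exactly once
    # (condensed form), so there are no self-matches and no division by 2.
    condensed = pdist(points[['x', 'y']].to_numpy())
    return int((condensed < max_d).sum())

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
points = pd.DataFrame({'x': rng.uniform(0, 1, 50),
                       'y': rng.uniform(0, 1, 50)})

n_pairs = count_pairs(points, 0.1)

# Cross-check against a loop-style count: the full symmetric matrix counts
# every pair twice and also counts each point as its own neighbour.
full = squareform(pdist(points[['x', 'y']].to_numpy())) < 0.1
assert n_pairs == (full.sum() - len(points)) // 2

print(n_pairs)
```

The cross-check shows how the loop-based total relates to the pair count: subtract the number of points (the self-matches) before halving.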

For visualization purposes:

f,ax = plt.subplots(subplot_kw= 
    dict(aspect='equal', xlim=(-.5, .5), ylim=(-.5, .5))) 
ax.plot(points[:,0], points[:,1], 'b+ ') 
ax.plot(poi[0], poi[1], ms=15, marker='s', mfc='none', mec='g') 
ax.plot(points[close_points,0], points[close_points,1], 
    marker='o', mfc='none', mec='r', ls='') # draw all points within distance 

t = np.linspace(0, 2*np.pi, 512) 
circle = dist_min*np.vstack([np.cos(t), np.sin(t)]).T 
ax.plot((circle+poi)[:,0], (circle+poi)[:,1], 'k:') # Add a visual check for that distance 
plt.show() 
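
If the pairs themselves are needed (for example, to draw a line between each pair of neighbours, as the question's plot attempts), one possible approach is to read them off the upper triangle of the same distance matrix. This is a sketch, not part of the answer above: the helper name close_pairs is hypothetical, and np.triu_indices is used here as one convenient way to visit each pair once.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def close_pairs(points, max_d):
    """Return an (m, 2) array of index pairs (i, j), i < j, closer than max_d."""
    dists = squareform(pdist(points))
    # k=1 selects the strict upper triangle: it skips the diagonal
    # (self-distances of 0) and lists each pair only once.
    i, j = np.triu_indices(len(points), k=1)
    mask = dists[i, j] < max_d
    return np.column_stack([i[mask], j[mask]])

points = np.random.uniform(-.5, .5, (1000, 2))
pairs = close_pairs(points, .1)
print("{} pairs within a distance of {}".format(len(pairs), .1))
```

Each row of pairs holds two row indices into points, so points[pairs[:, 0]] and points[pairs[:, 1]] give the endpoints of every segment. Note that the full matrix needs memory proportional to the square of the number of points.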
