2014-04-09 92 views
1

Calculating Euclidean distance with PCA in Python

I have PCA output stored as a NumPy array of 3D points:

pcar =[[xa ya za] 
     [xb yb zb] 
     [xc yc zc] 
     . 
     . 
     [xn yn zn]] 

where each row is a point. I selected two random rows from the PCA array above to act as cluster seeds:

out_list=pcar[numpy.random.randint(0,pcar.shape[0],2)] 

This gives a NumPy array with 2 rows.

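I need to find the Euclidean distance from each row of out_list to every row (point) in pcar, and add each pcar point to the cluster of its nearest out_list point.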

+0

What happens if two points have the same x, y, z? – Harpal

Answers

2

There is a very fast implementation in SciPy:

from scipy.spatial.distance import cdist, pdist 

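cdist takes two arrays of points, such as your pcar and out_list, and computes the distance between every pair of points across them. pdist takes a single array and gives you only the upper triangle of its pairwise distance matrix.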

Because they are implemented in C or Fortran behind the scenes, they are very efficient.
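As a rough sketch of how cdist could be applied to the question, the snippet below builds a stand-in pcar from random data, picks two seed rows, and assigns every point to its nearest seed. The random data, the fixed seed value, and the names seed_idx, dists, labels and clusters are assumptions for illustration and are not from the original post. Sampling with replace=False is also a swap from the question's randint call, so that the two seeds are never the same row.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)      # fixed seed, only so the sketch is repeatable
pcar = rng.random((10, 3))          # stand-in for the real PCA output (assumption)

# Pick two distinct rows as cluster seeds.
seed_idx = rng.choice(pcar.shape[0], size=2, replace=False)
out_list = pcar[seed_idx]

# dists[i, j] is the Euclidean distance from pcar[i] to out_list[j].
dists = cdist(pcar, out_list)       # shape (10, 2); the default metric is Euclidean

# Index of the nearest seed for every point (0 or 1).
labels = dists.argmin(axis=1)

# Split pcar into one array per seed.
clusters = [pcar[labels == k] for k in range(len(out_list))]
```

Each seed ends up in its own cluster because its distance to itself is zero. When a point is equally close to both seeds, argmin picks the first one.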

+0

See also: http://stackoverflow.com/questions/17527340/more-efficient-way-to-calculate-distance-in-numpy. If your arrays get huge, these methods are not the best choice. – usethedeathstar

2

Edit: OK, I downloaded, installed and taught myself numpy. Here is a numpy version.

Old answer

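I know you wanted a numpy answer. My numpy is rusty, but since there were no other answers, I thought I'd give you one in Matlab. It should translate directly. I'm assuming the problem is the concept, not the code.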

Note that there are many ways to skin this cat; I'm just giving one.

Working numpy version

import numpy as np 

pcar = np.random.rand(10,3) 

out_list=pcar[np.random.randint(0,pcar.shape[0],2)] 

ol_1 = out_list[0,:] 
ol_2 = out_list[1,:] 

## Get the individual distances
## NumPy broadcasting subtracts the length-3 ol vector from every row of
## the 10x3 array, so there is no need to replicate it with a matrix of
## ones as in the Matlab version
d1 = pcar - ol_1
d2 = pcar - ol_2

##% Square them using an element-wise square 
d1s = np.square(d1) 
d2s = np.square(d2) 

##% Sum across the rows, not down columns 
d1ss = np.sum(d1s, axis=1) 
d2ss = np.sum(d2s, axis=1) 

##% Square root using an element-wise square-root 
e1 = np.sqrt(d1ss) 
e2 = np.sqrt(d2ss) 

##% Assign to class one or class two 
##% Start by assigning one to everything, then select all those where ol_2 
##% is closer and assign them the number 2 
assign = np.ones(e1.shape[0], dtype=int)
assign[e2 < e1] = 2

##% Separate 
pcar1 = pcar[ assign==1, :] 
pcar2 = pcar[ assign==2, :] 
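The step-by-step version above can also be condensed. The following is a minimal sketch, not the answerer's code: it uses broadcasting and np.linalg.norm to compute all point-to-seed distances at once, and it works for any number of seeds. The helper name assign_to_nearest is hypothetical, and the first two rows are used as seeds only to keep the example deterministic.

```python
import numpy as np

def assign_to_nearest(points, seeds):
    """Return 1-based labels giving the nearest seed for each point.

    Hypothetical helper for illustration only.
    """
    # diff[i, j, :] = points[i] - seeds[j], shape (n_points, n_seeds, n_dims)
    diff = points[:, np.newaxis, :] - seeds[np.newaxis, :, :]
    # Euclidean norm over the coordinate axis gives an (n_points, n_seeds) array.
    dists = np.linalg.norm(diff, axis=2)
    # argmin returns the first minimum, so ties go to the lower-numbered seed,
    # matching the e2 < e1 rule used above.
    return dists.argmin(axis=1) + 1

pcar = np.random.rand(10, 3)
out_list = pcar[:2]                 # first two rows as seeds (assumption)

assign = assign_to_nearest(pcar, out_list)
pcar1 = pcar[assign == 1]
pcar2 = pcar[assign == 2]
```

The intermediate diff array holds n_points × n_seeds × n_dims values, so this form trades memory for brevity. That is fine for two seeds but worth keeping in mind for large inputs.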

Working Matlab version

close all 
clear all 

% Create 10 records each with 3 attributes 
pcar = rand(10, 3) 

% Pick two (normally at random of course) 
out_list = pcar(1:2, :) 

% Hard-coding this separately, though this can be done iteratively 
ol_1 = out_list(1,:) 
ol_2 = out_list(2,:) 

% Get the individual distances 
% The trick here is to pre-multiply the 1x3 ol vector with a column of 
% ones of size 10x1 to get a 10x3 array with ol replicated, so that it 
% can simply be subtracted 
d1 = pcar - ones(size(pcar,1), 1)*ol_1 
d2 = pcar - ones(size(pcar,1), 1)*ol_2 

% Square them using an element-wise square 
d1s = d1.^2 
d2s = d2.^2 

% Sum across the rows, not down columns 
d1ss = sum(d1s, 2) 
d2ss = sum(d2s, 2) 

% Square root using an element-wise square-root 
e1 = sqrt(d1ss) 
e2 = sqrt(d2ss) 

% Assign to class one or class two 
% Start by assigning one to everything, then select all those where ol_2 
% is closer and assign them the number 2 
assign = ones(length(e1),1); 
assign(e2<e1)=2 

% Separate 
pcar1 = pcar(assign==1, :) 
pcar2 = pcar(assign==2, :) 

% Plot 
plot3(pcar1(:,1), pcar1(:,2), pcar1(:,3), 'g+') 
hold on 
plot3(pcar2(:,1), pcar2(:,2), pcar2(:,3), 'r+') 
plot3(ol_1(1), ol_1(2), ol_1(3), 'go') 
plot3(ol_2(1), ol_2(2), ol_2(3), 'ro') 
+1

While this is useful, it doesn't answer the OP's question, since they want Python. Good effort, though. – EdChum

+1

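@EdChum I realise the OP is using numpy. If the problem is the concept, then an answer in pseudocode would be fine. And if the pseudocode can actually run, why not use Matlab, which is a first cousin of numpy? Anyway, I figured it couldn't hurt. – timbo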

+2

@EdChum OK, I've provided a numpy version. It's a credit to stackoverflow and python that I could download, install and learn enough numpy to translate my code in about 30 minutes! – timbo
