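Visually separating data into two classes in MATLAB
I have two clusters of data. Each point has x, y coordinates and a value identifying its class (1 for class 1, 2 for class 2). I have plotted the data, but I would like to separate the classes visually with a boundary. Is there a function that does this? I tried contour, but it did not help!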
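Consider this classification problem (using the Iris dataset):
As you can see, finding the boundary is not a trivial task, except for easily separable clusters whose boundary equation you know beforehand...
One idea is to use the discriminant analysis function classify to find the boundary (you can choose between linear and quadratic boundaries).
Below is a complete example illustrating the process. The code requires the Statistics Toolbox: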
%# load Iris dataset (make it binary-class with 2 features)
load fisheriris
data = meas(:,1:2);
labels = species;
labels(~strcmp(labels,'versicolor')) = {'non-versicolor'};
NUM_K = numel(unique(labels)); %# number of classes
numInst = size(data,1); %# number of instances
%# visualize data
figure(1)
gscatter(data(:,1), data(:,2), labels, 'rb', '*o', ...
10, 'on', 'sepal length', 'sepal width')
title('Iris dataset'), box on, axis tight
%# params
classifierType = 'quadratic'; %# 'quadratic', 'linear'
npoints = 100;
clrLite = [1 0.6 0.6 ; 0.6 1 0.6 ; 0.6 0.6 1];
clrDark = [0.7 0 0 ; 0 0.7 0 ; 0 0 0.7];
%# discriminant analysis
%# classify the grid space of these two dimensions
mn = min(data); mx = max(data);
[X,Y] = meshgrid(linspace(mn(1),mx(1),npoints) , linspace(mn(2),mx(2),npoints));
X = X(:); Y = Y(:);
[C,err,P,logp,coeff] = classify([X Y], data, labels, classifierType);
%# find incorrectly classified training data
[CPred,err] = classify(data, data, labels, classifierType);
bad = ~strcmp(CPred,labels);
%# plot grid classification color-coded
figure(2), hold on
image(X, Y, reshape(grp2idx(C),npoints,npoints))
axis xy, colormap(clrLite)
%# plot data points (correctly and incorrectly classified)
gscatter(data(:,1), data(:,2), labels, clrDark, '.', 20, 'on');
%# mark incorrectly classified data
plot(data(bad,1), data(bad,2), 'kx', 'MarkerSize',10)
axis([mn(1) mx(1) mn(2) mx(2)])
%# draw decision boundaries between pairs of clusters
for i=1:NUM_K
    for j=i+1:NUM_K
        %# boundary between classes i and j: 0 = K + [x y]*L (+ [x y]*Q*[x y]' if quadratic)
        if strcmp(coeff(i,j).type, 'quadratic')
            K = coeff(i,j).const;
            L = coeff(i,j).linear;
            Q = coeff(i,j).quadratic;
            %# L is a 2-element vector, so sprintf expands it into two values
            f = sprintf('0 = %g + %g*x + %g*y + %g*x.^2 + %g*x.*y + %g*y.^2',...
                K,L,Q(1,1),Q(1,2)+Q(2,1),Q(2,2));
        else
            K = coeff(i,j).const;
            L = coeff(i,j).linear;
            f = sprintf('0 = %g + %g*x + %g*y', K,L(1),L(2));
        end
        h2 = ezplot(f, [mn(1) mx(1) mn(2) mx(2)]);
        set(h2, 'Color','k', 'LineWidth',2)
    end
end
xlabel('sepal length'), ylabel('sepal width')
title(sprintf('accuracy = %.2f%%', 100*(1-sum(bad)/numInst)))
hold off
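Since the question mentions contour: as a rough sketch (not part of the original example), the same boundary can probably also be drawn by contouring the grid of predicted class indices at the level halfway between the two classes. This assumes the same two-class Iris setup and the Statistics Toolbox as above; the variable names xv, yv and Z are only illustrative.

```matlab
%# minimal sketch: draw the class boundary with contour instead of ezplot
load fisheriris
data = meas(:,1:2);
labels = species;
labels(~strcmp(labels,'versicolor')) = {'non-versicolor'};

%# classify a regular grid covering the range of the data
npoints = 100;
mn = min(data); mx = max(data);
xv = linspace(mn(1), mx(1), npoints);
yv = linspace(mn(2), mx(2), npoints);
[X,Y] = meshgrid(xv, yv);
C = classify([X(:) Y(:)], data, labels, 'quadratic');

%# turn predicted labels into indices (1 or 2) and reshape back to the grid
Z = reshape(grp2idx(C), npoints, npoints);

%# the boundary is where the index switches between 1 and 2, i.e. level 1.5
figure, hold on
gscatter(data(:,1), data(:,2), labels, 'rb', '*o');
contour(xv, yv, Z, [1.5 1.5], 'LineColor','k', 'LineWidth',2)
xlabel('sepal length'), ylabel('sepal width')
hold off
```

The contour line is only as smooth as the grid, so a larger npoints gives a cleaner curve; the ezplot approach above draws the boundary from the fitted coefficients instead.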
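+1 ... beautiful! – Jacob 2009-12-26 03:54:26
@Amro - is it just me, or is the second screenshot missing? – Shai 2012-12-16 10:43:30
@Shai: it's not just you; old images uploaded to imageshack tend to disappear for some reason... anyway, I updated the example with fresh images :) – Amro 2012-12-16 14:57:26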