2012-02-15 46 views
2

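MATLAB - classifying output

My program uses K-means clustering with a number of clusters supplied by the user. In this case k = 4, but I would like to run the clustered data through MATLAB's naive Bayes classifier.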

Is there a way to split the clusters apart and feed them into different naive Bayes classifiers in MATLAB?

Naive Bayes:

class = classify(test,training, target_class, 'diaglinear'); 

K-means:

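%% generate sample data
K = 4;
numObservarations = 5000;
dimensions = 42;
data = rand(numObservarations, dimensions);  % assumed placeholder: random data, since the real data set is not shown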
%% cluster 
opts = statset('MaxIter', 500, 'Display', 'iter'); 
[clustIDX, clusters, interClustSum, Dist] = kmeans(data, K, 'options',opts, ... 
'distance','sqEuclidean', 'EmptyAction','singleton', 'replicates',3); 
%% plot data+clusters 
figure, hold on 
scatter3(data(:,1),data(:,2),data(:,3), 5, clustIDX, 'filled') 
scatter3(clusters(:,1),clusters(:,2),clusters(:,3), 100, (1:K)', 'filled') 
hold off, xlabel('x'), ylabel('y'), zlabel('z') 
%% plot clusters quality 
figure 
[silh,h] = silhouette(data, clustIDX); 
avrgScore = mean(silh); 
%% Assign data to clusters 
% calculate distance (squared) of all instances to each cluster centroid 
D = zeros(numObservarations, K);  % init distances 
for k=1:K 
% squared Euclidean distance: d = sum((x-y).^2), no square root (matches 'sqEuclidean')
D(:,k) = sum(((data - repmat(clusters(k,:),numObservarations,1)).^2), 2); 
end 
% find for all instances the cluster closest to it
[minDists, clusterIndices] = min(D, [], 2); 
% compare it with what you expect it to be 
sum(clusterIndices == clustIDX) 
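
For reference, the distance loop above can also be written with pdist2 from the Statistics Toolbox. The following is only a sketch on small made-up data; the names DLoop, DP2 and nearest are illustrative and not part of the original code.

```matlab
% Sketch only: small random data stands in for the real data set
rng(1);                                   % reproducible random numbers
data = rand(200, 5);
K = 4;
[clustIDX, clusters] = kmeans(data, K, 'EmptyAction','singleton', 'replicates',3);

% squared distances to each centroid, computed with the explicit loop
n = size(data, 1);
DLoop = zeros(n, K);
for k = 1:K
    DLoop(:,k) = sum((data - repmat(clusters(k,:), n, 1)).^2, 2);
end

% the same distances via pdist2 (Euclidean by default), squared afterwards
DP2 = pdist2(data, clusters).^2;

% nearest centroid for every observation
[~, nearest] = min(DP2, [], 2);
fprintf('max difference between the two distance matrices: %g\n', ...
    max(abs(DLoop(:) - DP2(:))));
fprintf('%d of %d points sit in their nearest-centroid cluster\n', ...
    sum(nearest == clustIDX), n);
```

The second count is usually, but not necessarily, equal to the number of observations, because kmeans can stop at MaxIter before the assignments settle.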

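Something like outputting the k clusters as k1, k2, k3 and then having the naive Bayes classifier pick those up, so that instead of test it would be k1, k2, and so on: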

class = classify(k1,training, target_class, 'diaglinear'); 

But I just don't know how to get the output of the k clusters into some format in MATLAB. (I'm really new to this program.)

Edit

training = [1;0;-1;-2;4;0]; % sample training data
target_class = ['posi';'zero';'negi';'negi';'posi';'zero']; % one label per row of training, in the same order
% target_class holds the class labels of the training data; here the classes are 'posi' (positive), 'zero' and 'negi' (negative)

% Training and testing the classifier
test = 10*randn(10,1) % test data: random numbers
class = classify(test,training, target_class, 'diaglinear') % classifies the test data based on the given training data using a naive Bayes classifier

% 'diaglinear' selects the naive Bayes classifier; there is also 'diagquadratic'

Answers

1

Try this:

% create 100 random points (this is the training data) 
X = rand(100,3); 

% cluster into 5 clusters 
K = 5; 
[IDX, C] = kmeans(X, K); 

% now let us say you have new data and you want 
% to classify it based on the training: 
SAMPLE = rand(10,3); 
CLASS = classify(SAMPLE,X,IDX); 

If you just want to filter out one of the clusters, you can do something like this with the data:

K1 = X(IDX==1, :)  % all rows of X assigned to cluster 1 (the ", :" keeps every column)
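
The same indexing can be repeated for every cluster, and each cluster can then be passed to classify as the sample. This is only a sketch: the training set and its labels below are made up for illustration, and 'diaglinear' is used because the question asks for naive Bayes (classify defaults to 'linear' otherwise). The names clusterData and predicted are hypothetical.

```matlab
% Sketch only: X, training and target_class are random stand-ins
rng(0);                                    % reproducible random numbers
X = rand(100, 3);
K = 5;
IDX = kmeans(X, K, 'EmptyAction','singleton');

% one cell per cluster; each cell holds the rows of X assigned to that cluster
clusterData = cell(K, 1);
for k = 1:K
    clusterData{k} = X(IDX==k, :);
end

% hypothetical labelled training set with the same number of columns as X
training = rand(60, 3);
target_class = double(training(:,1) > 0.5);   % assumed labels: 1 if first feature > 0.5, else 0

% classify the rows of each cluster separately with the naive Bayes variant
predicted = cell(K, 1);
for k = 1:K
    predicted{k} = classify(clusterData{k}, training, target_class, 'diaglinear');
end
```

A cell array is used here because the clusters generally contain different numbers of rows, so they cannot be stacked into one ordinary 3-D array.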

Hope that helps.

+0

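Zenpoy, thanks a bunch! But when you use SAMPLE as the test data, wouldn't you use K1 instead? Or am I mixing up test, training, and target_class? As I understand it, target_class should be the label for each training row, training is the data used to learn how to recognize the classes, and test is the sample data you want classified (in my specific problem, one of the clusters)? – 2012-02-19 16:15:27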

+0

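I'm not sure, but I think you are confusing things. According to the documentation (`help classify`): CLASS = classify(SAMPLE,TRAINING,GROUP) classifies each row of the data in SAMPLE into one of the groups in TRAINING. SAMPLE and TRAINING must be matrices with the same number of columns. GROUP is a grouping variable for TRAINING. Its unique values define groups, and each element defines which group the corresponding row of TRAINING belongs to. GROUP can be a categorical variable, a numeric vector, a string array, or a cell array of strings. – zenpoy 2012-02-19 18:03:07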
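
To make the roles of the three arguments concrete, here is a minimal sketch with made-up one-column data. The values and the 'pos'/'neg' labels are assumptions chosen for illustration only.

```matlab
% Sketch only: tiny hand-made data set with one feature column
TRAINING = [1; 0.5; 2; -1; -2; -0.5];                 % six labelled observations
GROUP    = {'pos'; 'pos'; 'pos'; 'neg'; 'neg'; 'neg'}; % one label per TRAINING row
SAMPLE   = [3; -3];                                    % unlabelled rows to classify

% SAMPLE and TRAINING must have the same number of columns,
% and GROUP must have one entry per row of TRAINING
assert(size(SAMPLE, 2) == size(TRAINING, 2));
assert(numel(GROUP) == size(TRAINING, 1));

% 'diaglinear' is the naive Bayes variant used in the question
CLASS = classify(SAMPLE, TRAINING, GROUP, 'diaglinear');
disp(CLASS)   % 3 lies nearer the 'pos' training values, -3 nearer the 'neg' ones
```

To classify one k-means cluster, that cluster's rows take the place of SAMPLE, while TRAINING and GROUP remain the labelled data.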

+0

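Ah wait, there are multiple options. Yes, you can group them, but you can also classify them individually. See the code in my edit above. Note that I take the training data and label it with the target classes. Then I "test" the classifier with random numbers, and the output labels each one as positive or negative. In my case, I would simply use one of my clusters as the test data. – 2012-02-20 10:28:33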