Classifying data based on a training set

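I have some data that needs to be classified. I have tried using the classify function described here.

My sample is a matrix with 1 column and 382 rows.

My training is a matrix with 1 column and 2 rows.

The grouping is what causes my problem. I wrote: grouping = [a,b]; where a is one category and b is another.

This gives me the error: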

Undefined function or variable 'a'. 
Error in discrimtrialab (line 89) 
grouping = [a,b];

Also, how do I classify against a range of values, i.e. not just the exact values in training?

Here is my code:

a = -0.09306:0.0001:0.00476; 
b = -0.02968:0.0001:0.01484; 

%training = groups (odour index) 

training = [-0.09306:0.00476; -0.02968:0.01484;]; 

%grouping variable 

group = [a,b] 

%classify 

[class, err] = classify(sample, training, group, 'linear'); 

class(a)

(Note: there is some processing of sample above this, but it is irrelevant to the question.)

2013-07-31 user2587726

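What about class(a) and class(b)? – Schorsch

Same error. The error is raised before the classify function completes. – user2587726

Do you want 'a' to be a string, as in 'a'? Or is it a variable that contains the category? – Schorsch

From the documentation: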

class = classify(sample,training,group) classifies each row of the data in sample into one of the groups in training. (See Grouped Data.) sample and training must be matrices with the same number of columns. group is a grouping variable for training. Its unique values define groups; each element defines the group to which the corresponding row of training belongs.

That is, group must have the same number of rows as training. From the example in the help:

load fisheriris 
SL = meas(51:end,1); 
SW = meas(51:end,2); 
group = species(51:end);

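SL and SW are 100×1 matrices used for training (two different measurements taken on each of 100 samples). group is a 100×1 cell array of strings indicating which species each measurement belongs to. It could also be a character array or a simple list of numbers (1, 2, 3) where each number denotes a different group, but it must have 100 rows.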
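
Continuing that help example, a minimal sketch of the full call might look like the following. The two rows in sample are made-up measurements for illustration only, and classify needs the Statistics Toolbox:

```matlab
% Training data from the help example: 100 rows, 2 columns
load fisheriris
SL = meas(51:end,1);
SW = meas(51:end,2);
group = species(51:end);      % 100x1 cell array of species names
training = [SL SW];

% group must supply exactly one label per row of training
assert(size(training,1) == numel(group));

% Hypothetical unknown measurements; same number of columns as training
sample = [5.5 2.5; 7.0 3.2];

% One predicted species per row of sample, same type as group
class = classify(sample, training, group);
```

Because group is a cell array of strings here, class comes back as a cell array of strings too, with one entry per row of sample.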

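For example, if your training matrix is a 100×1 matrix of doubles, where the first 50 values belong to "a" and the second 50 belong to "b", your group matrix could be: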

group = [repmat('a',50,1);repmat('b',50,1)];
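
Put together for single-column data, a rough sketch could look like this. The training values are invented (evenly spaced numbers inside the two ranges from the question), so treat them as placeholders for real measurements; classify again assumes the Statistics Toolbox:

```matlab
% Hypothetical training values: 50 known "a" values, then 50 known "b" values
trainA = linspace(-0.060, -0.030, 50)';
trainB = linspace(-0.016, -0.002, 50)';
training = [trainA; trainB];                    % 100x1 double

% One label per training row, in the same order as the rows
group = [repmat('a',50,1); repmat('b',50,1)];   % 100x1 char

% Values to be classified (also made up)
sample = [-0.045; -0.010];

% Linear discriminant analysis is the default type
class = classify(sample, training, group);
```

With groups this well separated, the first sample value is assigned to 'a' and the second to 'b'.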

However, if all your "groups" are just non-overlapping ranges, as you said in a comment here:

What I want classify to do is work out whether or not each number in "sample" is type A, ie, in the range -0.04416 +/- 0.0163, or type B, with the range -0.00914 +/- 0.00742

then you don't really need classify. To extract the samples that are equal to a value plus or minus some tolerance:

sample1 = sample(abs(sample-value)<tol);
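
Applied to the two ranges quoted above, the same idea can label every element of sample at once. This is only a sketch: the sample values and the '?' placeholder for "in neither range" are made up, and <= is used so the range endpoints count as inside:

```matlab
% Hypothetical values to label
sample = [-0.0500; -0.0300; -0.0100; 0.0020; -0.0200];

% Centres and tolerances taken from the ranges in the comment
centreA = -0.04416;  tolA = 0.0163;
centreB = -0.00914;  tolB = 0.00742;

% Logical masks: true where the value lies inside the range
isA = abs(sample - centreA) <= tolA;
isB = abs(sample - centreB) <= tolB;

% Start with '?' (neither range), then overwrite the matches
label = repmat('?', numel(sample), 1);
label(isA) = 'a';
label(isB) = 'b';

disp(label')   % aab??
```

Since the two ranges do not overlap, no value can match both masks, so the order of the last two assignments does not matter.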

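ETA after the latest comments: group can be a numeric vector, so if you have a training data set that you need to group based on the range of some variable, then something like this (this code is unchecked, but the basic principle should be sound):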

%presume "data" is our training data (381 x 3) and "sample" (n x 2) is the data we want to classify 
group = zeros(size(data,1),1); % one label per training row, initialised to 0 (unassigned)

% first column is variable for grouping, second + third are data equivalent to the entries in "sample". 
training = data(:,2:3); 

% find where data(:,1) meets whatever our requirements are and label groups with numbers 
group(data(:,1)<3)=1; % group "1" is wherever first column is below 3 
group(data(:,1)>7)=2; % group "2" is wherever first column is above 7 
group(group==0)=NaN; % set any remaining data to NaN 

%now we classify "sample" based on "data" which has been split into "training" and "group" variables 
class = classify(sample, training, group);

2013-08-02 10:37:17 nkjt

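So do I have to predetermine which category each value should belong to? What is the point of the function then? – user2587726

You have to predetermine which categories the "training" values belong to. It will then tell you which categories the "sample" values belong to (discriminant analysis). For example, with the Fisher iris data set, sample would be measurements from an unknown species, training would be measurements from known species, and group would be the data saying which species each training value came from. – nkjt

Can you post the code I would need to predetermine the categories of the training values? – user2587726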
