2012-07-12 248 views

Matrix dimensions must agree when applying this method

%% When an outlier is considered to be more than three standard deviations away from the mean, use the following syntax to determine the number of outliers in each column of the count matrix: 

mu = mean(data) 
sigma = std(data) 
[n,p] = size(data); 
% Create a matrix of mean values by replicating the mu vector for n rows 
MeanMat = repmat(mu,n,1); 
% Create a matrix of standard deviation values by replicating the sigma vector for n rows 
SigmaMat = repmat(sigma,n,1); 
% Create a matrix of zeros and ones, where ones indicate the location of outliers 
outliers = abs(data - MeanMat) > 3*SigmaMat; 
% Calculate the number of outliers in each column 
nout = sum(outliers) 
% To remove an entire row of data containing the outlier 
data(any(outliers,2),:) = []; %% this line 

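The last line removes a certain number of observations (rows) from my data set. Later in my program this causes a problem, because I have manually set the number of observations (rows) to 1000.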

%% generate sample data 
K = 6; 
numObservarations = 1000; 
dimensions = 3; 

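If I change numObservarations to data I get a scalar output error, but if I leave it unchanged I get the following error because the number of rows no longer matches: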

??? Error using ==> minus 
Matrix dimensions must agree. 

Error in ==> datamining at 106 
    D(:,k) = sum(((data - 
    repmat(clusters(k,:),numObservarations,1)).^2), 2); 

Is there a way to set numObservarations so that it automatically detects the number of rows in data and returns that count as a single number?

Answer


I must be misunderstanding something. As far as I can tell, this should be sufficient:

numObservations = size(data, 1);
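
To show where that line would go, here is a minimal sketch that assumes the row count is read after the outlier rows have been deleted. The sample data and the placeholder `clusters` matrix are made up for illustration and are not from the question. The variable keeps the question's spelling, `numObservarations`, so that the distance line from the error message runs unchanged.

```matlab
% Illustrative data only (assumption): 1000 rows, 3 columns, values in [-1, 1]
t = (1:1000)';
data = [sin(t), cos(t), sin(2*t)];
data(1:5, 1) = 50;                       % plant five obvious outliers in column 1

% Outlier removal, following the steps in the question
mu = mean(data);
sigma = std(data);
n = size(data, 1);
outliers = abs(data - repmat(mu, n, 1)) > 3 * repmat(sigma, n, 1);
data(any(outliers, 2), :) = [];          % drop every row that contains an outlier

% Read the row count AFTER the deletion instead of hard-coding 1000
numObservarations = size(data, 1);

% Distance computation from the error message; dimensions now agree
K = 6;
clusters = data(1:K, :);                 % placeholder cluster centers (assumption)
D = zeros(numObservarations, K);
for k = 1:K
    D(:,k) = sum((data - repmat(clusters(k,:), numObservarations, 1)).^2, 2);
end
```

Note that `size(data, 1)` returns a single number, whereas `size(data)` with one output returns a vector of all dimensions, and passing `data` itself to `repmat` is not a valid replication count.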