Applying this method gives the error: Matrix dimensions must agree
%% When an outlier is considered to be more than three standard deviations away from the mean, use the following syntax to determine the number of outliers in each column of the count matrix:
mu = mean(data)
sigma = std(data)
[n,p] = size(data);
% Create a matrix of mean values by replicating the mu vector for n rows
MeanMat = repmat(mu,n,1);
% Create a matrix of standard deviation values by replicating the sigma vector for n rows
SigmaMat = repmat(sigma,n,1);
% Create a matrix of zeros and ones, where ones indicate the location of outliers
outliers = abs(data - MeanMat) > 3*SigmaMat;
% Calculate the number of outliers in each column
nout = sum(outliers)
% To remove an entire row of data containing the outlier
data(any(outliers,2),:) = []; %% this line
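The last line removes a certain number of observations (rows) from my data set. Later in my program I run into a problem, though, because I have manually set the number of observations (rows) to 1000: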
%% generate sample data
K = 6;
numObservarations = 1000;
dimensions = 3;
If I change numObservarations to data, I get a scalar output error, but if I leave it unchanged, I get this error because the number of rows no longer matches:
??? Error using ==> minus
Matrix dimensions must agree.
Error in ==> datamining at 106
D(:,k) = sum(((data -
repmat(clusters(k,:),numObservarations,1)).^2), 2);
Is there a way to set numObservarations so that it automatically detects the number of rows in data and returns it as just a single number?
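One possible approach, sketched under assumptions: read the row count from the matrix itself with size(data,1) after the outlier rows have been removed. Calling size(data) with a single output returns a two-element vector [rows columns], whereas size(data,1) returns only the row count as a scalar. The sample data and the choice of initial cluster centres below are hypothetical stand-ins, not taken from the original program.

```matlab
% Hypothetical stand-in for the real data: 99 ordinary rows plus one extreme row
data = [repmat((1:99)', 1, 3); 1e4 1e4 1e4];

% Outlier removal as in the question (more than 3 standard deviations from the mean)
mu = mean(data);
sigma = std(data);
n = size(data, 1);
outliers = abs(data - repmat(mu, n, 1)) > 3 * repmat(sigma, n, 1);
data(any(outliers, 2), :) = [];

% Take the sizes from data AFTER the rows are removed, instead of hard-coding 1000
numObservarations = size(data, 1);   % scalar: number of rows left
dimensions = size(data, 2);          % scalar: number of columns

% The expression from the failing line now matches data row for row
K = 6;
clusters = data(1:K, :);             % assumed initial centres, for illustration only
D = zeros(numObservarations, K);
for k = 1:K
    D(:, k) = sum((data - repmat(clusters(k, :), numObservarations, 1)).^2, 2);
end
```

The key point is the ordering: numObservarations has to be assigned after the line that deletes rows, otherwise it still holds the old row count and repmat builds a matrix that is taller than data.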