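Matlab Naive Bayes
Hello, I am using the KDD 1999 dataset and I want to apply Naive Bayes to it in MATLAB. The KDD data is a 494021x42 array. What I would like to understand is the role of "training" and "target_class" in the Naive Bayes code below: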
training = [1;0;-1;-2;4;0]; % this is the sample data.
target_class = ['posi';'zero';'negi';'negi';'posi';'zero'];
% This should have the same number of rows as training data but why?
% Training and Testing the classifier (between positive and negative)
test = 10*randn(10,1) % this is for testing. I am generating random numbers.
predicted_class = classify(test, training, target_class, 'diaglinear')
% (result variable renamed from "class" so it does not shadow MATLAB's built-in class function)
% This command classifies the test data depending on the given training data using a Naive Bayes classifier
% 'diaglinear' selects the naive Bayes classifier; there is also 'diagquadratic'
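On the "but why?" in the comment above: target_class holds one label per training row, which is why the two row counts must agree. A minimal sketch with a toy stand-in for a few KDD-style rows follows. The feature values, the variable names `features`, `labels` and `test_rows`, and the choice of the labels normal, neptune and smurf are illustrative assumptions, not data taken from the real file.

```matlab
% Toy stand-in for KDD-style rows: two numeric feature columns per row.
% (Hypothetical values, not taken from the real dataset.)
features = [181 5450; 239 486; 0 0; 0 0; 1032 0; 520 0];

% One label per training row; this plays the role of target_class.
labels = {'normal'; 'normal'; 'neptune'; 'neptune'; 'smurf'; 'smurf'};

% classify pairs row i of the training matrix with label i,
% so the number of rows and the number of labels must match.
assert(size(features, 1) == numel(labels));

% Unlabelled rows with the same number of columns as the training matrix.
test_rows = [200 5000; 0 0; 900 0];

% Returns one predicted label per row of test_rows.
predicted = classify(test_rows, features, labels, 'diaglinear');
```

With the real 494021x42 array the same pattern would presumably apply, with the feature columns as the training matrix and the label column as target_class, once the symbolic columns have been converted to numbers.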
What I would like to know is: does "target_class" correspond to the attack types in the KDD dataset?
back dos
buffer_overflow u2r
ftp_write r2l
guess_passwd r2l
imap r2l
ipsweep probe
land dos
loadmodule u2r
multihop r2l
neptune dos
nmap probe
perl u2r
phf r2l
pod dos
portsweep probe
rootkit u2r
satan probe
smurf dos
spy r2l
teardrop dos
warezclient r2l
warezmaster r2l
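If the four categories (dos, u2r, r2l, probe) rather than the individual attack names are to serve as target_class, the list above can be turned into a lookup table. This is only a sketch: `attack_to_category` and `row_labels` are hypothetical names, and the four sample labels are made up for illustration.

```matlab
% Attack names and their categories, copied from the list above.
attack_names = {'back','buffer_overflow','ftp_write','guess_passwd','imap', ...
    'ipsweep','land','loadmodule','multihop','neptune','nmap','perl','phf', ...
    'pod','portsweep','rootkit','satan','smurf','spy','teardrop', ...
    'warezclient','warezmaster'};
categories = {'dos','u2r','r2l','r2l','r2l', ...
    'probe','dos','u2r','r2l','dos','probe','u2r','r2l', ...
    'dos','probe','u2r','probe','dos','r2l','dos', ...
    'r2l','r2l'};

% Lookup table from attack name to category.
attack_to_category = containers.Map(attack_names, categories);

% Hypothetical per-row labels, as they might appear in the label column.
row_labels = {'smurf'; 'neptune'; 'nmap'; 'rootkit'};

% Translate each row label into its category to build target_class.
target_class = cellfun(@(name) attack_to_category(name), row_labels, ...
    'UniformOutput', false);
```

Rows labelled as normal traffic are not in the list above, so they would need their own entry in the map before this lookup is applied to the full dataset.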
Or is the target class the set of column headers contained in the "test" set? i.e.
protocol_type: symbolic.
service: symbolic.
flag: symbolic.
src_bytes: continuous.
dst_bytes: continuous.
land: symbolic.
wrong_fragment: continuous.
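These column headers describe features, not classes, and classify expects a numeric matrix, so the symbolic columns need some numeric encoding first. One simple (and fairly crude) option is to replace each distinct string with an integer code, sketched below. The sample values and the names `protocol_values` and `protocol_code` are assumptions for illustration; treating such codes as continuous values is a simplification that may not suit a Gaussian naive Bayes model.

```matlab
% Hypothetical sample of one symbolic column and one continuous column.
protocol_type = {'tcp'; 'udp'; 'tcp'; 'icmp'};
src_bytes = [181; 105; 239; 1032];

% unique returns the sorted distinct strings and, as its third output,
% the index of each row's string within that sorted list.
[protocol_values, ~, protocol_code] = unique(protocol_type);
protocol_code = protocol_code(:);   % force a column vector

% Numeric matrix that could be passed to classify as training data.
training = [protocol_code, src_bytes];
```

The same codes must be reused for the test data, for example by looking each test string up in `protocol_values` with ismember instead of calling unique again.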
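Strange. Does this mean that, with fewer attack types in the training set, I may not get any meaningful results? You would think the test data would contain fewer and the training data more, for accuracy. – 2012-01-29 15:08:32
If you train your classifier today, you can only train it on the attacks seen so far. It is reasonable to assume that new attacks related to existing ones may appear in the future. Detecting those is your task too! – 2012-01-29 15:14:26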
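To see which attack types occur only in the test labels, the two label sets can be compared directly. A minimal sketch, assuming the labels have already been loaded into cell arrays; `train_labels` and `test_labels` are hypothetical names and the values are made up:

```matlab
% Hypothetical label columns from the training and test sets.
train_labels = {'smurf'; 'neptune'; 'smurf'; 'back'};
test_labels = {'smurf'; 'satan'; 'neptune'; 'satan'};

% Labels that appear in the test set but never in the training set.
unseen = setdiff(unique(test_labels), unique(train_labels));

% Logical mask of test rows whose true label the classifier never saw.
is_unseen_row = ismember(test_labels, unseen);
```

Rows flagged by `is_unseen_row` can never be predicted with their exact label, so it may be more informative to score them at the category level (dos, u2r, r2l, probe).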