2012-08-28 20 views

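I need to implement an SVM classifier on keyword frequencies in RSS feeds using Weka's LibSVM, in order to classify these feeds into target categories. However, I am not sure which version to run given my data. Which version of SVM should I run in Weka?

My .arff file typically contains the following data: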

@attribute Keyword_1_nasa_Frequency numeric 
@attribute Keyword_2_fish_Frequency numeric 
@attribute Keyword_3_kill_Frequency numeric 
@attribute Keyword_4_show_Frequency numeric 
… 
@attribute RSSFeedCategoryDescription {BFE,FCL,F,M, NCA, SNT,S} 

@data 
0,0,0,34,0,0,0,0,0,40,0,0,0,0,0,0,0,0,0,0,24,0,0,0,0,13,0,0,0,0,0,0,0,0,0,0,0,0, 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,BFE 
0,0,0,12,0,0,0,0,0,20,0,0,0,0,0,0,0,0,0,0,25,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,BFE 
0,0,0,10,0,0,0,0,0,11,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,BFE 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,BFE 
… 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,FCL 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,F 
… 
20,0,64,19,0,162,0,0,36,72,179,24,24,47,24,40,0,48,0,0,0,97,24,0,48,205,143,62,7 
8,0,0,216,0,36,24,24,0,0,24,0,0,0,0,140,24,0,0,0,0,72,176,0,0,144,48,0,38,0,284, 
221,72,0,72,0,SNT 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,SNT 
0,0,0,0,0,0,11,0,0,0,0,0,0,0,19,0,0,0,0,0,0,0,0,0,0,10,0,0,0,0,0,0,0,0,0,0,0,0,0 
,0,0,0,0,0,0,0,0,0,17,0,0,0,0,0,0,0,0,0,0,0,0,0,20,0,S 

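And so on: there are 570 rows in total, each containing the keyword frequencies found in one feed on one day. In this case there are 57 feeds over 10 days, giving 570 records to classify. Each keyword attribute is prefixed with "Keyword_" and a surrogate number, and suffixed with "Frequency".

In other cases, however, I have used Boolean values in place of the frequencies, so that the first row above becomes:

false,false,false,true,false,...,BFE

And so on, where 34 becomes true because it meets the threshold, and the other values become false because they do not.

As far as I can determine, there are three types of SVM in Weka. Can anyone tell me which type I should use with the data above?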
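To make the conversion concrete, here is a minimal Java sketch of the thresholding step. The class name `FrequencyThreshold`, its methods, and the threshold value of 30 are all hypothetical; the actual threshold is not stated above, and any value from 1 to 34 would give the same result for these five counts. In the ARFF header, such attributes would be declared nominal, for example `{false,true}`, rather than numeric.

```java
import java.util.StringJoiner;

class FrequencyThreshold {

    // Returns true for every count that is at or above the threshold.
    static boolean[] toBoolean(int[] frequencies, int threshold) {
        boolean[] flags = new boolean[frequencies.length];
        for (int i = 0; i < frequencies.length; i++) {
            flags[i] = frequencies[i] >= threshold;
        }
        return flags;
    }

    // Formats one ARFF data row: the flags followed by the class label.
    static String toArffRow(boolean[] flags, String category) {
        StringJoiner row = new StringJoiner(",");
        for (boolean flag : flags) {
            row.add(Boolean.toString(flag));
        }
        row.add(category);
        return row.toString();
    }

    public static void main(String[] args) {
        int[] firstFive = {0, 0, 0, 34, 0}; // first five counts of the first data row
        int threshold = 30;                 // hypothetical value, not given in the question
        System.out.println(toArffRow(toBoolean(firstFive, threshold), "BFE"));
    }
}
```

Using `>=` follows the wording "meets the threshold"; a strict `>` would be the alternative if the threshold itself should not count.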

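Answer

I would suggest trying all three kernel types and determining which one performs best on your training and validation data (plot the results), then continuing to use that trained model to predict on new inputs.

In Weka, you can save the model for future use.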
