2014-03-29 40 views
2

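StringToWordVector not working properly in Weka

I am using the StringToWordVector filter to convert my ARFF file into word-vector format.

But it throws an exception: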

weka.core.WekaException: weka.classifiers.bayes.NaiveBayesMultinomialUpdateable: Not enough training instances with class labels (required: 1, provided: 0)! 

I tried the same thing in the Weka Explorer and it worked fine.

Here is my code:

    ArffLoader loader = new ArffLoader();
    loader.setFile(new File("valid file"));
    Instances structure = loader.getStructure();
    structure.setClassIndex(0);

    // train NaiveBayes
    NaiveBayesMultinomialUpdateable n = new NaiveBayesMultinomialUpdateable();
    FilteredClassifier f = new FilteredClassifier();
    StringToWordVector s = new StringToWordVector();

    f.setFilter(s);
    f.setClassifier(n);

    f.buildClassifier(structure);
    Instance current;
    while ((current = loader.getNextInstance(structure)) != null)
        n.updateClassifier(current);

    // output generated model
    System.out.println(n);

I tried another example, but it still does not work:

    ArffLoader loader = new ArffLoader();
    loader.setFile(new File("valid file"));

    Instances structure = loader.getStructure();

    // train NaiveBayes
    NaiveBayesMultinomialUpdateable n = new NaiveBayesMultinomialUpdateable();
    FilteredClassifier f = new FilteredClassifier();
    StringToWordVector s = new StringToWordVector();
    s.setInputFormat(structure);
    Instances struct = Filter.useFilter(structure, s);

    struct.setClassIndex(0);
    System.out.println(struct.numAttributes()); // only gives 2 or 1 attributes

    n.buildClassifier(struct);
    Instance current;
    while ((current = loader.getNextInstance(struct)) != null)
        n.updateClassifier(current);

    // output generated model
    System.out.println(n);

The number of attributes printed is always 2 or 1.

It seems that StringToWordVector is not working as expected.
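
One possible reading of the attribute count, offered as an assumption rather than a confirmed diagnosis: `loader.getStructure()` returns only the header, with no instances, and a bag-of-words filter derives its word attributes from the text it is shown. The plain-Java sketch below is not Weka code. `BagOfWordsSketch`, `buildDictionary`, and `vectorize` are hypothetical names used only to illustrate why a header-only input would produce no word columns.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a tiny bag-of-words transform, NOT Weka's implementation.
class BagOfWordsSketch {

    // Build a word -> column-index dictionary from the documents provided.
    static Map<String, Integer> buildDictionary(List<String> documents) {
        Map<String, Integer> dictionary = new LinkedHashMap<>();
        for (String doc : documents) {
            for (String token : doc.toLowerCase().split("\\s+")) {
                if (!token.isEmpty()) {
                    // A new word takes the next free column index.
                    dictionary.putIfAbsent(token, dictionary.size());
                }
            }
        }
        return dictionary;
    }

    // Count how often each dictionary word occurs in one document.
    static int[] vectorize(String document, Map<String, Integer> dictionary) {
        int[] counts = new int[dictionary.size()];
        for (String token : document.toLowerCase().split("\\s+")) {
            Integer column = dictionary.get(token);
            if (column != null) {
                counts[column]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Header only (no documents): the dictionary has no words at all.
        System.out.println(buildDictionary(List.of()).size());

        // With documents, each distinct word becomes one column.
        Map<String, Integer> dict =
                buildDictionary(List.of("oil spill", "police report oil"));
        System.out.println(dict.keySet());
        System.out.println(Arrays.toString(vectorize("oil oil police", dict)));
    }
}
```

If the same reasoning applies to the filter used here, the dictionary would need to be built from a batch that actually contains instances, not from the empty header.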

Original folder: https://www.dropbox.com/sh/cma4hbe2r96ul1c/GL2wNdeVUz

Converted to ARFF: https://www.dropbox.com/s/efle6ci4lb5riq7/test1.arff

+1

Can you post a sample of the ARFF file? – FromWhereToWhere

+0

@FromWhereToWhere Hi, I have updated the question with the files – aceminer

Answers

1

According to your ARFF, the class appears to be the second of the two attributes, so the problem is here:

struct.setClassIndex(0); 

Try:

struct.setClassIndex(1); 

Update: I made this change to the first example; it no longer throws an exception and prints:

The independent probability of a class 
-------------------------------------- 
oil spill 40.0 
police 989.0 

The probability of a word given the class 
----------------------------------------- 
        oil spill   police
class   Infinity    Infinity
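
The fix above hard-codes the class position. As a hedged alternative, the position can be looked up by attribute name from the ARFF header, which avoids guessing between 0 and 1. The sketch below is plain Java, not Weka code: `ArffHeaderSketch`, `attributeNames`, and `classIndex` are hypothetical helpers, and the sample header (including the attribute names `text` and `label`) is an assumption, not the contents of test1.arff. It also does not handle quoted attribute names that contain spaces.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: finds an attribute's position in an ARFF header by name.
class ArffHeaderSketch {

    // Collect attribute names in declaration order, stopping at @data.
    static List<String> attributeNames(String arffHeader) {
        List<String> names = new ArrayList<>();
        for (String line : arffHeader.split("\\R")) {
            String trimmed = line.trim();
            String lower = trimmed.toLowerCase();
            if (lower.startsWith("@attribute")) {
                String rest = trimmed.substring("@attribute".length()).trim();
                // The name is taken as the first whitespace-delimited token.
                String name = rest.split("\\s+")[0];
                if (!name.isEmpty()) {
                    names.add(name);
                }
            } else if (lower.startsWith("@data")) {
                break;
            }
        }
        return names;
    }

    // Return the zero-based index of the named attribute, or fail loudly.
    static int classIndex(String arffHeader, String className) {
        int index = attributeNames(arffHeader).indexOf(className);
        if (index < 0) {
            throw new IllegalArgumentException("No attribute named " + className);
        }
        return index;
    }

    public static void main(String[] args) {
        // Hypothetical header: a string attribute followed by a nominal class.
        String header = String.join("\n",
                "@relation sample",
                "@attribute text string",
                "@attribute label {oil_spill,police}",
                "@data");
        System.out.println(classIndex(header, "label"));
    }
}
```

With a loaded dataset, the same idea would be to resolve the class attribute by name and pass its index to `setClassIndex`, after confirming that the dataset really has that many attributes at that point in the pipeline.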
+0

It still gives me java.lang.IllegalArgumentException: Invalid class index: 1 – aceminer

+1

@aceminer Please check the updated answer. I tested with the given test1.arff and Weka 3.6.10 – FromWhereToWhere

+0

It seems that Weka 3.7.10 does not work properly, but Weka 3.6.10 works fine – aceminer