2012-12-29 37 views

Mulan classify data - not working

I want to classify some data with Mulan, but I get an exception:

mulan.data.DataLoadException: Error creating Instances data from supplied Reader data source 
at mulan.data.MultiLabelInstances.loadInstances(MultiLabelInstances.java:469) 
at mulan.data.MultiLabelInstances.loadInstances(MultiLabelInstances.java:458) 
at mulan.data.MultiLabelInstances.<init>(MultiLabelInstances.java:168) 
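The top-level message above does not say why loading failed. A minimal sketch for listing every nested cause of an exception, assuming (not confirmed here) that the DataLoadException wraps the underlying parser or I/O error as its cause. The helper name CauseChain is hypothetical and only uses standard java.lang.Throwable methods:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects "class: message" for an exception and each nested cause.
final class CauseChain {
    static List<String> describe(Throwable t) {
        List<String> lines = new ArrayList<>();
        int guard = 0; // protects against a cyclic cause chain
        while (t != null && guard++ < 32) {
            lines.add(t.getClass().getName() + ": " + t.getMessage());
            t = t.getCause(); // null once the innermost cause is reached
        }
        return lines;
    }
}
```

Calling CauseChain.describe(e) in the catch block and printing each entry would show the innermost error, if one is attached.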

The main function is from mulan.examples.TrainTestExperiment:

import mulan.classifier.transformation.BinaryRelevance;
import mulan.data.MultiLabelInstances;
import mulan.evaluation.Evaluation;
import mulan.evaluation.Evaluator;
import weka.classifiers.Classifier;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Instances;
import weka.core.Utils;
import weka.filters.Filter;
import weka.filters.unsupervised.instance.RemovePercentage;

public class TrainTestExperiment {

    public static void main(String[] args) {
        try {
            String path = Utils.getOption("path", args); // e.g. -path dataset/
            String filestem = Utils.getOption("filestem", args); // e.g. -filestem emotions
            String percentage = Utils.getOption("percentage", args); // e.g. -percentage 50 (for 50%)

            System.out.println("Loading the dataset");
            MultiLabelInstances mlDataSet = new MultiLabelInstances(path + filestem + ".arff", path + filestem + ".xml");

            // split the data set into train and test
            Instances dataSet = mlDataSet.getDataSet();
            RemovePercentage rmvp = new RemovePercentage();
            rmvp.setInvertSelection(true); // inverted selection keeps the given percentage for training
            rmvp.setPercentage(Double.parseDouble(percentage));
            rmvp.setInputFormat(dataSet);
            Instances trainDataSet = Filter.useFilter(dataSet, rmvp);

            rmvp = new RemovePercentage(); // removes the same percentage, leaving the rest for testing
            rmvp.setPercentage(Double.parseDouble(percentage));
            rmvp.setInputFormat(dataSet);
            Instances testDataSet = Filter.useFilter(dataSet, rmvp);

            MultiLabelInstances train = new MultiLabelInstances(trainDataSet, path + filestem + ".xml");
            MultiLabelInstances test = new MultiLabelInstances(testDataSet, path + filestem + ".xml");

            Evaluator eval = new Evaluator();
            Evaluation results;

            // binary relevance: one NaiveBayes model per label
            Classifier brClassifier = new NaiveBayes();
            BinaryRelevance br = new BinaryRelevance(brClassifier);
            br.setDebug(true);
            br.build(train);
            results = eval.evaluate(br, test);
            System.out.println(results);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

As for the data format, I have one dimension (attribute) called title and 160 labels.

The data file is formatted according to the ARFF format.

Part of the text is in Chinese.
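Since the file contains Chinese text, one thing that can be ruled out is the file encoding. The following is only a sketch, assuming the ARFF file is meant to be UTF-8; whether encoding is actually related to the exception is not known. The class name Utf8Check is hypothetical, and only standard java.nio calls are used:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reports whether a byte array decodes cleanly as UTF-8.
final class Utf8Check {
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)      // fail instead of substituting
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false; // some byte sequence is not valid UTF-8
        }
    }
}
```

It could be applied to the bytes of the .arff file, for example those returned by java.nio.file.Files.readAllBytes. A false result would suggest the file was saved in another encoding such as GBK or Big5.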

Any help is appreciated.

Best regards

Answers