我已經使用Weka GUI來訓練和測試一個文件(做出預測),但是不能對API執行相同的操作。我得到的錯誤表示列車和測試文件中有不同數量的屬性。在GUI中,這可以通過檢查「輸出預測」來解決。Weka輸出預測
如何使用API做類似的事情?你知道那裏有什麼樣嗎?
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.meta.FilteredClassifier;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.NominalToBinary;
import weka.filters.unsupervised.attribute.Remove;
public class WekaTutorial
{
public static void main(String[] args) throws Exception
{
DataSource trainSource = new DataSource("/tmp/classes - edited.arff"); // training
Instances trainData = trainSource.getDataSet();
DataSource testSource = new DataSource("/tmp/classes_testing.arff");
Instances testData = testSource.getDataSet();
if (trainData.classIndex() == -1)
{
trainData.setClassIndex(trainData.numAttributes() - 1);
}
if (testData.classIndex() == -1)
{
testData.setClassIndex(testData.numAttributes() - 1);
}
String[] options = weka.core.Utils.splitOptions("weka.filters.unsupervised.attribute.StringToWordVector -R first-last -W 1000 -prune-rate -1.0 -N 0 -stemmer weka.core.stemmers.NullStemmer -M 1 "
+ "-tokenizer \"weka.core.tokenizers.WordTokenizer -delimiters \" \\r\\n\\t.,;:\\\'\\\"()?!\"");
Remove remove = new Remove();
remove.setOptions(options);
remove.setInputFormat(trainData);
NominalToBinary filter = new NominalToBinary();
NaiveBayes nb = new NaiveBayes();
FilteredClassifier fc = new FilteredClassifier();
fc.setFilter(filter);
fc.setClassifier(nb);
// train and make predictions
fc.buildClassifier(trainData);
for (int i = 0; i < testData.numInstances(); i++)
{
double pred = fc.classifyInstance(testData.instance(i));
System.out.print("ID: " + testData.instance(i).value(0));
System.out.print(", actual: " + testData.classAttribute().value((int) testData.instance(i).classValue()));
System.out.println(", predicted: " + testData.classAttribute().value((int) pred));
}
}
}
錯誤:
Exception in thread "main" java.lang.IllegalArgumentException: Src and Dest differ in # of attributes: 2 != 17152
這不是對GUI的問題。
你申請一些過濾器?如果是的話,這可能會幫助你 https://stackoverflow.com/questions/33337500/trained-and-test-data-have-different-number-of-attributes-that-gave-an-error-tr – dram