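Weka training data from a MySQL database

This is my second post about using Weka (the first post is here). I successfully used TextDirectoryLoader to feed Weka the training data and the sample test data, which works great. Now I want to move this to production, so the data to be classified has to be retrieved from a MySQL table. This is how I am doing it: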
import java.io.File;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.TextDirectoryLoader;
import weka.experiment.InstanceQuery;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.StringToWordVector;

// Load the training data: one sub-directory per class, one text file per document
TextDirectoryLoader loader = new TextDirectoryLoader();
loader.setDirectory(new File("c:/Users/Yehia A.Salam/Desktop/dd/training-data"));
Instances dataRaw = loader.getDataSet();
// Turn the raw text into word-vector attributes
StringToWordVector filter = new StringToWordVector();
filter.setInputFormat(dataRaw);
Instances dataTraining = Filter.useFilter(dataRaw, filter);
// Create test data instances [this works, but the sample data now needs to come from the db instead, see below]
//loader.setDirectory(new File("c:/Users/Yehia A.Salam/Desktop/dd/test-data"));
//dataRaw = loader.getDataSet();
//Instances dataTest = Filter.useFilter(dataRaw, filter);
// Retrieve the test data from MySQL instead
InstanceQuery query = new InstanceQuery();
query.setUsername("myusername");
query.setPassword("mypassword");
String sql = "SELECT d.desc FROM deals d WHERE d.j48 = 1";
query.setQuery(sql);
Instances dataTest = Filter.useFilter(query.retrieveInstances(), filter);
// Classify
J48 model = new J48();
model.buildClassifier(dataTraining);
for (int i = 0; i < dataTest.numInstances(); i++) {
    dataTest.instance(i).setClassMissing();
    double cls = model.classifyInstance(dataTest.instance(i));
    dataTest.instance(i).setClassValue(cls);
    // Print the predicted class index and its label
    System.out.println(cls + " -> " + dataTest.instance(i).classAttribute().value((int) cls));
}
Unfortunately this does not work; Weka stops unexpectedly at this line:
Instances dataTest = Filter.useFilter(query.retrieveInstances(), filter);
So I guess my question is how to convert this part
// Create test data instances [this works, but the sample data now needs to come from the db instead, see below]
//loader.setDirectory(new File("c:/Users/Yehia A.Salam/Desktop/dd/test-data"));
//dataRaw = loader.getDataSet();
//Instances dataTest = Filter.useFilter(dataRaw, filter);
to SQL-based data:
InstanceQuery query = new InstanceQuery();
query.setUsername("myusername");
query.setPassword("mypassword");
String sql = "SELECT d.desc FROM deals d WHERE d.j48 = 1";
query.setQuery(sql);
Instances dataTest = Filter.useFilter(query.retrieveInstances(), filter);
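Note that there is no problem with the database connection; I do get the correct number of instances.
I'd appreciate any help, I'm very close.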
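Because the program stops without an obvious message, one way to see why is to wrap the failing call so that its full stack trace is printed before the exception propagates. The sketch below is plain Java with no Weka dependency; `StackTraceProbe`, `stackTraceOf`, and `runStep` are hypothetical names introduced for illustration, and the array access in `main` is only a stand-in for the failing Weka call.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Weka: makes a silent failure visible.
class StackTraceProbe {

    /** Renders the full stack trace of t (including causes) as a String. */
    static String stackTraceOf(Throwable t) {
        StringWriter buffer = new StringWriter();
        t.printStackTrace(new PrintWriter(buffer, true));
        return buffer.toString();
    }

    /**
     * Runs one step; if it throws, prints the labelled stack trace to stderr
     * and rethrows the same exception so the caller still sees the failure.
     */
    static <T> T runStep(String label, Callable<T> step) throws Exception {
        try {
            return step.call();
        } catch (Exception e) {
            System.err.println("Step '" + label + "' failed:");
            System.err.println(stackTraceOf(e));
            throw e;
        }
    }

    public static void main(String[] args) {
        try {
            // Stand-in for the failing call: read index 1 of a one-element array.
            runStep("filter test data", () -> {
                int[] oneColumn = new int[1];
                return oneColumn[1];
            });
        } catch (Exception e) {
            System.out.println("Caught: " + e.getClass().getName());
        }
    }
}
```

In the code above, the failing line could be wrapped as `StackTraceProbe.runStep("filter test data", () -> Filter.useFilter(query.retrieveInstances(), filter))`, which should show whether the exception comes from the query or from the filter.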
What is the stack trace when Weka stops "unexpectedly"? Have you inspected the output of 'query.retrieveInstances()'? – 2013-03-20 10:46:12
Are you sure about your SQL: 'SELECT d.desc FROM deals d WHERE d.j48 = 1'? I would expect something like 'SELECT d.desc FROM deal AS d WHERE d.j48 = 1'. – 2013-03-21 09:39:11
@JanEglinger I tried adding AS, but no luck. I checked query.retrieveInstances() for errors, and it is o = (java.lang.ArrayIndexOutOfBoundsException) java.lang.ArrayIndexOutOfBoundsException: 1 – 2013-03-25 21:51:31
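An ArrayIndexOutOfBoundsException at index 1 would be consistent with the filter looking for a second attribute (the class attribute that TextDirectoryLoader adds next to the text) in a dataset that contains only the single desc column. That is a guess based on the code in the question, not a confirmed diagnosis. Under that assumption, a minimal sketch is to copy the database text into an empty dataset that reuses the training header, and to filter that dataset instead. `DbTestDataAdapter` and `alignToTrainingHeader` are hypothetical names, and `DenseInstance` assumes Weka 3.7 or later (in Weka 3.6 the concrete class is `Instance`).

```java
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;

// Hypothetical helper: gives database rows the same structure as the training data.
class DbTestDataAdapter {

    /**
     * Builds a dataset with the header of trainingRaw (text attribute plus
     * class attribute) and one row per instance of dbData. The text is read
     * from column dbTextIndex of dbData; the class value is left missing.
     */
    static Instances alignToTrainingHeader(Instances trainingRaw,
                                           Instances dbData,
                                           int dbTextIndex) {
        // Empty copy of the training header; the class index is carried over.
        Instances aligned = new Instances(trainingRaw, 0);

        // Assumption: the first string attribute is the document text.
        Attribute textAttr = null;
        for (int i = 0; i < aligned.numAttributes(); i++) {
            if (aligned.attribute(i).isString()) {
                textAttr = aligned.attribute(i);
                break;
            }
        }
        if (textAttr == null) {
            throw new IllegalArgumentException("training header has no string attribute");
        }

        for (int i = 0; i < dbData.numInstances(); i++) {
            // A new DenseInstance starts with every value missing.
            Instance row = new DenseInstance(aligned.numAttributes());
            row.setDataset(aligned);
            // stringValue works whether the query returned a nominal or a string column.
            row.setValue(textAttr, dbData.instance(i).stringValue(dbTextIndex));
            aligned.add(row);
        }
        return aligned;
    }
}
```

With this helper, the failing line might become `Instances dataTest = Filter.useFilter(DbTestDataAdapter.alignToTrainingHeader(dataRaw, query.retrieveInstances(), 0), filter);`, provided `dataRaw` still refers to the training data at that point.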