Java Weka StringToWordVector does not count word occurrences correctly

I'm using the Java API of the Weka machine learning library, and I have the following code:
String html = "repeat repeat repeat";
Attribute input = new Attribute("html",(FastVector) null);
FastVector inputVec = new FastVector();
inputVec.addElement(input);
Instances htmlInst = new Instances("html",inputVec,1);
htmlInst.add(new Instance(1));
htmlInst.instance(0).setValue(0, html);
StringToWordVector filter = new StringToWordVector();
filter.setUseStoplist(true);
filter.setInputFormat(htmlInst);
Instances dataFiltered = Filter.useFilter(htmlInst, filter);
Instance last = dataFiltered.lastInstance();
System.out.println(last);
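StringToWordVector is supposed to count how many times each word occurs in the string, but instead of counting the word "repeat" 3 times, it reports a count of only 1.
What am I doing wrong?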
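
One likely explanation, offered as a hedged sketch rather than a confirmed diagnosis: by default StringToWordVector outputs boolean word presence (0 or 1) rather than occurrence counts, and counts are enabled with its `-C` option, i.e. `filter.setOutputWordCounts(true)`, normally set before `setInputFormat` is called. The plain-Java example below does not use Weka at all; `WordVectorDemo`, `wordCounts`, and `wordPresence` are hypothetical names used only to illustrate the difference between the two output modes on the same input string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WordVectorDemo {

    // Count how often each whitespace-separated token occurs
    // (analogous to word-count output).
    static Map<String, Integer> wordCounts(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Collapse every count to 1: a word is either present or absent
    // (analogous to boolean word-presence output).
    static Map<String, Integer> wordPresence(String text) {
        Map<String, Integer> presence = new LinkedHashMap<>();
        for (String token : wordCounts(text).keySet()) {
            presence.put(token, 1);
        }
        return presence;
    }

    public static void main(String[] args) {
        String html = "repeat repeat repeat";
        System.out.println("presence: " + wordPresence(html)); // presence: {repeat=1}
        System.out.println("counts:   " + wordCounts(html));   // counts:   {repeat=3}
        // Assumed analogous change in the question's Weka code:
        // filter.setOutputWordCounts(true); // set before filter.setInputFormat(htmlInst)
    }
}
```

If this is indeed the cause, adding the `setOutputWordCounts(true)` call next to `setUseStoplist(true)` in the code above should make the printed instance show 3 for "repeat" instead of 1.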