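Sorry, I am new to recommender systems, but I have written a few lines of code with the Apache Mahout library. My data set is very small: a 500x100 matrix with 8102 known cells. The RMSE I get seems too small.
The data set is actually a subset of the Yelp data set from the "Yelp Business Rating Prediction" competition. I took only the 100 highest-rated restaurants and then the 500 most active customers.
I created an SVDRecommender and evaluated its RMSE. The result is about 0.4. Why is it so small? Maybe I am misunderstanding something, since my data set is not very sparse, but when I tried a larger, sparser data set the RMSE became even smaller (about 0.18)! Can anyone explain this behaviour?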
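    import java.io.File;
    import org.apache.mahout.cf.taste.common.TasteException;
    import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
    import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
    import org.apache.mahout.cf.taste.impl.eval.RMSRecommenderEvaluator;
    import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
    import org.apache.mahout.cf.taste.impl.recommender.svd.Factorization;
    import org.apache.mahout.cf.taste.impl.recommender.svd.RatingSGDFactorizer;
    import org.apache.mahout.cf.taste.impl.recommender.svd.SVDRecommender;
    import org.apache.mahout.cf.taste.model.DataModel;
    import org.apache.mahout.cf.taste.recommender.Recommender;

    // Load all known ratings (userID,itemID,rating per line).
    DataModel model = new FileDataModel(new File("datamf.csv"));
    // 20 latent features, 200 SGD iterations; constructed from the full model.
    final RatingSGDFactorizer factorizer = new RatingSGDFactorizer(model, 20, 200);
    final Factorization f = factorizer.factorize();
    RecommenderBuilder builder = new RecommenderBuilder() {
        @Override
        public Recommender buildRecommender(DataModel model) throws TasteException {
            // Build any existing or customized recommendation algorithm here.
            // The model argument is the training split supplied by the evaluator.
            return new SVDRecommender(model, factorizer);
        }
    };
    RecommenderEvaluator evaluator = new RMSRecommenderEvaluator();
    // Train on 60% of each user's ratings, test on the rest; evaluate all users.
    double score = evaluator.evaluate(builder, null, model, 0.6, 1);
    System.out.println(score);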
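
For reference, this is what I understand the printed score to mean: the square root of the mean squared difference between the held-out ratings and the predicted ratings. Below is a minimal sketch of that calculation in plain Java. The class name `RmseSketch`, the method `rmse`, and the sample numbers are all made up for illustration; none of this is Mahout API and none of it is my real data.

```java
public final class RmseSketch {

    // Root-mean-square error between actual and predicted ratings.
    // Both arrays must be non-empty and have the same length.
    public static double rmse(double[] actual, double[] predicted) {
        if (actual.length == 0 || actual.length != predicted.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double sumOfSquares = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double diff = actual[i] - predicted[i];
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares / actual.length);
    }

    public static void main(String[] args) {
        // Hypothetical held-out ratings and predictions, each off by half a star.
        double[] actual = {4.0, 5.0, 3.0, 4.0};
        double[] predicted = {3.5, 4.5, 3.5, 4.5};
        System.out.println(rmse(actual, predicted)); // prints 0.5
    }
}
```

If that reading is right, an RMSE of 0.4 on a 1 to 5 star scale would mean the predictions are off by less than half a star on average, which is why the number looks suspiciously good to me.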