Mahout: normalizing UserSimilarity distance
I have an in-memory (non-Hadoop) model:
DataModel data = new FileDataModel(new File("file.csv"));
UserSimilarity userSimilarity = new PearsonCorrelationSimilarity(data);
userSimilarity.setPreferenceInferrer(new AveragingPreferenceInferrer(data));
UserNeighborhood userNeighborhood = new NearestNUserNeighborhood(1, userSimilarity, data);
The value returned by userSimilarity is not normalized to a range such as [0, 100], so to display it to the end user I use the following workaround:
double maxSim = userSimilarity.userSimilarity(userId1, userNeighborhood.getUserNeighborhood(userId1)[0]);
long finalSimilarity = Math.min(100, Math.max((int) Math.ceil(100 * userSimilarity.userSimilarity(userId1, userId2) / maxSim), 0));
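I have observed performance problems with this (several seconds per user). Is there another approach, or what is the fastest way to get min(similarity) = 0 and max(similarity) = 100 for each given user?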
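To make the scaling step concrete, here is a minimal sketch of it as a pure function with no Mahout dependency. The class name SimilarityScaler and the method toPercent are hypothetical names chosen for illustration. Returning 0 for a NaN similarity or a non-positive maximum is an assumption on my part, not something Mahout prescribes.

```java
// Hypothetical helper, not part of Mahout: isolates the [0, 100] scaling step.
final class SimilarityScaler {
    private SimilarityScaler() {}

    /**
     * Scales a raw similarity to an integer in [0, 100], relative to maxSim,
     * the similarity between the user and their nearest neighbor.
     */
    static int toPercent(double similarity, double maxSim) {
        // Assumption: an unknown similarity (NaN) or an unusable maximum maps to 0.
        if (Double.isNaN(similarity) || Double.isNaN(maxSim) || maxSim <= 0.0) {
            return 0;
        }
        // Same arithmetic as the snippet above: ratio to the maximum, rounded up.
        int scaled = (int) Math.ceil(100.0 * similarity / maxSim);
        // Clamp so negative correlations show as 0 and nothing exceeds 100.
        return Math.min(100, Math.max(scaled, 0));
    }
}
```

With a helper like this, maxSim could be computed once per userId1 and passed in for every userId2, instead of being looked up again for each pair. Whether that removes the delay depends on where the time is actually spent.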