2014-09-03 59 views
9

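Spark MLlib - collaborative filtering with implicit feedback

I am building a recommendation model with implicit feedback in Spark 1.0.0, and I am trying to follow the example on their collaborative filtering page: http://spark.apache.org/docs/latest/mllib-collaborative-filtering.html#explicit-vs-implicit-feedback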

I have even loaded the test dataset that the example refers to: http://codesearch.ruethschilling.info/xref/apache-foundation/spark/mllib/data/als/test.data

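However, when I try to run the implicit feedback model:

val alpha = 0.01
val model = ALS.trainImplicit(ratings, rank, numIterations, alpha)

(where ratings is exactly the ratings from their dataset, rank = 10, and numIterations = 20), I get the following error: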

scala> val model = ALS.trainImplicit(ratings, rank, numIterations, alpha) 
<console>:26: error: overloaded method value trainImplicit with alternatives: 
(ratings: org.apache.spark.rdd.RDD[org.apache.spark.mllib.recommendation.Rating],rank: Int,iterations: Int)org.apache.spark.mllib.recommendation.MatrixFactorizationModel <and> 
(ratings: org.apache.spark.rdd.RDD[org.apache.spark.mllib.recommendation.Rating],rank: Int,iterations: Int,lambda: Double,alpha: Double)org.apache.spark.mllib.recommendation.MatrixFactorizationModel <and> 
(ratings: org.apache.spark.rdd.RDD[org.apache.spark.mllib.recommendation.Rating],rank: Int,iterations: Int,lambda: Double,blocks: Int,alpha: Double)org.apache.spark.mllib.recommendation.MatrixFactorizationModel <and> 
(ratings: org.apache.spark.rdd.RDD[org.apache.spark.mllib.recommendation.Rating],rank: Int,iterations: Int,lambda: Double,blocks: Int,alpha: Double,seed: Long)org.apache.spark.mllib.recommendation.MatrixFactorizationModel 
cannot be applied to (org.apache.spark.rdd.RDD[org.apache.spark.mllib.recommendation.Rating], Int, Int, Double) 
val model = ALS.trainImplicit(ratings, rank, numIterations, alpha) 

Interestingly, the model runs just fine when I do not use trainImplicit (i.e., with ALS.train).

Answer

4

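The example appears to be out of sync with the implementation: there is no trainImplicit overload that takes four parameters, which is what the error message is telling you. However, if you look at the Scala source code for ALS, you will see that the three-parameter overload is implemented in terms of the six-parameter overload by passing a few "magic numbers":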

def trainImplicit(ratings: RDD[Rating], rank: Int, iterations: Int) 
    : MatrixFactorizationModel = { 
    trainImplicit(ratings, rank, iterations, 0.01, -1, 1.0) 
} 

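This suggests that 0.01 is a decent default for lambda, the fourth argument. (It may be worth checking with someone who knows ML more deeply.) The other two magic numbers are -1 for blocks and 1.0 for alpha. That should give you enough information to call the five- or six-parameter overload sensibly. (Of course, if you know enough to pick better values, that's great!)

For example, either of the following: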

val model = ALS.trainImplicit(ratings, rank, numIterations, 0.01, alpha) 

val model = ALS.trainImplicit(ratings, rank, numIterations, 0.01, -1, alpha) 
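To put the five-parameter call in context, here is a minimal end-to-end sketch rather than a definitive implementation. It assumes Spark 1.0.x with MLlib on the classpath and a local master. The object name ImplicitAlsSketch, the helper parseRating, and the four inline rows are hypothetical stand-ins for your own code and for the contents of test.data. Because lambda and alpha are both Double, giving them names before the call makes it harder to pass them in the wrong order.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.recommendation.{ALS, Rating}

object ImplicitAlsSketch {
  // Hypothetical helper: parse one "user,item,rating" line.
  // A line with a different shape throws a MatchError.
  def parseRating(line: String): Rating = line.split(',') match {
    case Array(user, item, rate) =>
      Rating(user.trim.toInt, item.trim.toInt, rate.trim.toDouble)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ImplicitAlsSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Assumption: a few made-up rows standing in for the real test.data file.
      val lines = sc.parallelize(Seq("1,1,5.0", "1,2,1.0", "2,1,5.0", "2,2,1.0"))
      val ratings = lines.map(parseRating)

      val rank = 10
      val numIterations = 20
      val lambda = 0.01 // regularization; the value the three-argument overload passes
      val alpha = 0.01  // confidence parameter from the question

      // Five-parameter overload: (ratings, rank, iterations, lambda, alpha)
      val model = ALS.trainImplicit(ratings, rank, numIterations, lambda, alpha)

      // Score one (user, product) pair with the trained model.
      println(model.predict(1, 1))
    } finally {
      sc.stop()
    }
  }
}
```

In the spark-shell you already have sc, so only the parsing and the trainImplicit call are needed there.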

Finally, in case you were not aware of it, there is quite good API documentation for ALS.


Perfect, the "magic numbers" approach seems to work just fine! Thank you so much for your help!! – atellez 2014-09-03 20:18:52


Yes, 0.01 is a good default value for lambda. – 2014-09-03 20:31:00