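I want to run a naive implementation of grid search with MLlib, but I'm a bit confused about how to choose the 'best' parameter ranges. Obviously, I don't want to waste too many resources on parameter combinations that are unlikely to yield a better model. From your experience, what would you suggest as the best parameter ranges for a grid search?
Setting the parameter ranges: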
import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
import org.apache.spark.mllib.optimization._

// Settings on the algorithm itself (a single candidate value each)
val intercept : List[Boolean] = List(false)
val classes : List[Int] = List(2)
val validate : List[Boolean] = List(true)
// Settings on the LBFGS optimizer
val tolerance : List[Double] = List(0.0000001 , 0.000001 , 0.00001 , 0.0001 , 0.001 , 0.01 , 0.1 , 1.0)
val gradient : List[Gradient] = List(new LogisticGradient() , new LeastSquaresGradient() , new HingeGradient())
val corrections : List[Int] = List(5 , 10 , 15)
val iters : List[Int] = List(1 , 10 , 100 , 1000 , 10000)
val regparam : List[Double] = List(0.0 , 0.0001 , 0.001 , 0.01 , 0.1 , 1.0 , 10.0 , 100.0)
val updater : List[Updater] = List(new SimpleUpdater() , new L1Updater() , new SquaredL2Updater())
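To put a number on the resource concern, here is a minimal sketch in plain Scala (no Spark needed) that multiplies out the list lengths above. The `sizes` map and `gridSize` are names made up for this illustration; the counts are copied by hand from the lists.

```scala
// Number of candidate values per parameter, copied by hand from the lists above.
// `sizes` and `gridSize` are illustrative names, not part of MLlib.
val sizes: Map[String, Int] = Map(
  "intercept"   -> 1,
  "classes"     -> 1,
  "validate"    -> 1,
  "tolerance"   -> 8,
  "gradient"    -> 3,
  "corrections" -> 3,
  "iters"       -> 5,
  "regparam"    -> 8,
  "updater"     -> 3
)

// The full grid is the Cartesian product, so its size is the product of the counts.
val gridSize: Long = sizes.values.map(_.toLong).product

println(s"Full grid: $gridSize combinations") // prints "Full grid: 8640 combinations"
```

So the full grid means 8640 model fits, a fifth of them with up to 10000 iterations.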
Running the grid search:
// Cartesian product of all parameter lists
val combinations = for {
  a <- intercept
  b <- classes
  c <- validate
  d <- tolerance
  e <- gradient
  f <- corrections
  g <- iters
  h <- regparam
  i <- updater
} yield (a, b, c, d, e, f, g, h, i)

// Only the first three combinations are tried for now.
// The model is configured here but not yet trained or evaluated.
for ((interceptS, classesS, validateS, toleranceS, gradientS, correctionsS, itersS, regParamS, updaterS) <- combinations.take(3)) {
  val lr : LogisticRegressionWithLBFGS = new LogisticRegressionWithLBFGS()
    .setIntercept(addIntercept = interceptS)
    .setNumClasses(numClasses = classesS)
    .setValidateData(validateData = validateS)
  lr.optimizer
    .setConvergenceTol(tolerance = toleranceS)
    .setGradient(gradient = gradientS)
    .setNumCorrections(corrections = correctionsS)
    .setNumIterations(iters = itersS)
    .setRegParam(regParam = regParamS)
    .setUpdater(updater = updaterS)
}
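One thing about `combinations.take(3)`: because `updater` is the innermost generator, the first three tuples differ only in the updater. A possible alternative, sketched below in plain Scala, is to try a random subset of the grid instead (random search, which is a different technique from the exhaustive grid above). `sampleGrid`, `demoGrid` and the seed are hypothetical names and values for this illustration, and the small stand-in grid uses plain numbers so the snippet runs without Spark.

```scala
import scala.util.Random

// Hypothetical helper (not part of MLlib): pick n combinations at random,
// without replacement, using a fixed seed so the run is repeatable.
def sampleGrid[A](grid: Seq[A], n: Int, seed: Long): Seq[A] =
  new Random(seed).shuffle(grid).take(n)

// Stand-in grid with plain values: (tolerance, regParam, iterations).
val demoGrid: Seq[(Double, Double, Int)] = for {
  tol <- List(1e-6, 1e-4, 1e-2)
  reg <- List(0.0, 0.01, 1.0)
  it  <- List(10, 100)
} yield (tol, reg, it)

// Five of the 18 combinations, spread across the grid rather than the first few.
val picked = sampleGrid(demoGrid, 5, seed = 42L)
picked.foreach(println)
```

With the real lists, this would amount to replacing `combinations.take(3)` by something like `sampleGrid(combinations, 3, seed = 42L)`.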
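Since it looks like you're the first to answer this question (after 1.5 years), it's fine by me that you posted it as an answer, despite lacking the reputation for normal commenting. An answer like this should normally be posted as a comment, since it doesn't involve any lines of code. Keep that in mind. Enjoy ;-) – ZF007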