
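Error: overloaded method value predict with alternatives / Double does not take parameters

I want to build a simple linear model that predicts the label value using LinearRegressionWithSGD. I transformed the dataset to get the features and the label, then converted the rows to labeled points for the regression: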

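import org.apache.spark.mllib.linalg.DenseVector
import org.apache.spark.mllib.regression.{LabeledPoint, LinearRegressionWithSGD}

// Use col2 as the label and keep only the features and label columns
val train = dftrain.withColumn("label", dftrain("col2")).select("features", "label")
val test = dftest.withColumn("label", dftest("col2")).select("features", "label")

// Convert each row to an mllib LabeledPoint (the ml DenseVector is converted to its mllib counterpart)
val realout = train.rdd.map(row => LabeledPoint(row.getAs[Double]("label"), DenseVector.fromML(row.getAs[org.apache.spark.ml.linalg.DenseVector]("features"))))
val realout1 = test.rdd.map(row => LabeledPoint(row.getAs[Double]("label"), DenseVector.fromML(row.getAs[org.apache.spark.ml.linalg.DenseVector]("features"))))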

Now I fit the model:

val numIterations = 100 
val stepSize = 0.00000001 
//fitting the model with converted Labeled points Train Data 
val model = LinearRegressionWithSGD.train(realout, numIterations, stepSize) 
17/08/09 12:16:15 WARN LinearRegressionWithSGD: The input data is not directly cached, which may hurt performance if its parent RDDs are also uncached.
17/08/09 12:16:17 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
17/08/09 12:16:17 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
17/08/09 12:16:17 WARN LinearRegressionWithSGD: The input data was not directly cached, which may hurt performance if its parent RDDs are also uncached.
model: org.apache.spark.mllib.regression.LinearRegressionModel = org.apache.spark.mllib.regression.LinearRegressionModel: intercept = 0.0, numFeatures = 1

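It gives me some warnings, and it also reports the intercept as 0.0, which I don't think is correct. On top of that, when I try to predict with the model, it throws an error: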

val prediction = model.predict(realout1)

<console>:98: error: overloaded method value predict with alternatives:
  (testData: org.apache.spark.api.java.JavaRDD[org.apache.spark.mllib.linalg.Vector])org.apache.spark.api.java.JavaRDD[Double] <and>
  (testData: org.apache.spark.mllib.linalg.Vector)Double <and>
  (testData: org.apache.spark.rdd.RDD[org.apache.spark.mllib.linalg.Vector])org.apache.spark.rdd.RDD[Double]
 cannot be applied to (org.apache.spark.rdd.RDD[org.apache.spark.mllib.regression.LabeledPoint])
     val prediction = model.predict(realout1) 
          ^

Also, if I do it this way instead, following the example from here:

// Evaluate model on training examples and compute training error
val valuesAndPreds = realout.map { point => val prediction = model.predict(point.features) (point.label, prediction) }

<console>:90: error: Double does not take parameters
       val valuesAndPreds = realout.map { point => val prediction = model.predict(point.features) (point.label, prediction) }
                                                                                                  ^

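I believe the steps are correct, but I don't understand why I get "overloaded method value predict with alternatives" in the first case and "Double does not take parameters" in the second.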

Answer

val prediction = model.predict(realout1.map(_.features)); 

This works fine, but I'm not sure how correct it is. Any suggestions are appreciated. Thanks.
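For what it's worth, here is a minimal sketch of why mapping to the features should be the right call, and how to keep each label next to its prediction. The error message in the question lists the `predict` overloads: a single `Vector`, an `RDD[Vector]`, or a `JavaRDD[Vector]`. None of them accepts an `RDD[LabeledPoint]`, so the labels have to be stripped first. The second error ("Double does not take parameters") most likely comes from writing both statements of the closure on one line: Scala then reads `model.predict(point.features) (point.label, prediction)` as applying the returned `Double` to arguments. The local session, the toy data (label = 2 * x), and the names `trainPoints`, `testPoints`, and `mse` are assumptions made for illustration and are not from the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.{LabeledPoint, LinearRegressionWithSGD}

// Assumption: a local session and toy data stand in for dftrain / dftest.
val spark = SparkSession.builder().master("local[*]").appName("sgd-predict-sketch").getOrCreate()
val sc = spark.sparkContext

// Hypothetical data following label = 2 * x.
// Caching the training RDD also avoids the "not directly cached" warning.
val trainPoints = sc.parallelize(
  Seq(1.0, 2.0, 3.0, 4.0).map(x => LabeledPoint(2.0 * x, Vectors.dense(x)))
).cache()
val testPoints = sc.parallelize(
  Seq(5.0, 6.0).map(x => LabeledPoint(2.0 * x, Vectors.dense(x)))
)

// Same call as in the question; the step size is chosen for the toy data only.
val model = LinearRegressionWithSGD.train(trainPoints, 100, 0.1)

// Option 1: strip the labels and use the RDD[Vector] overload (as in the answer above).
val predictions = model.predict(testPoints.map(_.features))

// Option 2: use the single-Vector overload per point and keep the label.
// The two statements sit on separate lines; on one line they would need a ';' between them.
val valuesAndPreds = testPoints.map { point =>
  val prediction = model.predict(point.features)
  (point.label, prediction)
}

// Mean squared error over the test points.
val mse = valuesAndPreds.map { case (label, pred) => math.pow(label - pred, 2) }.mean()
println(s"test MSE = $mse")
```

On the `intercept = 0.0` output: as far as I know, the `train` helper does not fit an intercept unless `setIntercept(true)` is called on a `LinearRegressionWithSGD` instance before running it, so a zero intercept is expected here rather than a sign of a broken fit. A step size as small as 0.00000001 may also leave the weights very close to their starting values, so it is worth checking `model.weights` as well.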
