2016-11-13 34 views

I am getting the following type mismatch: found org.apache.spark.sql.Dataset[(Double, Double)], required org.apache.spark.rdd.RDD[(Double, Double)]

found : org.apache.spark.sql.Dataset[(Double, Double)] 
required: org.apache.spark.rdd.RDD[(Double, Double)] 
    val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel) 
This error occurs in the following code:

val testScoreAndLabel = testResults. 
    select("Label","ModelProbability"). 
    map{ case Row(l:Double,p:Vector) => (p(1),l) } 
val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel) 

From the error, it seems that testScoreAndLabel is of type sql.Dataset, but BinaryClassificationMetrics expects an RDD.

How can I convert a sql.Dataset into an RDD?

Answer


I would do something like this:

val testScoreAndLabel = testResults. 
    select("Label","ModelProbability"). 
    map{ case Row(l:Double,p:Vector) => (p(1),l) } 

Now simply call testScoreAndLabel.rdd to get the underlying RDD:

val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel.rdd) 

See the API doc for the rdd member of Dataset, which converts testScoreAndLabel to an RDD.
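
For context, in Spark 2.x a DataFrame is a Dataset[Row], so calling map on it yields another Dataset rather than an RDD, which is why the compiler complains. Below is a minimal, self-contained sketch of the conversion, assuming Spark 2.x or later and that the probability column holds vectors from org.apache.spark.ml.linalg. The toy testResults data, the local session settings, and the object name DatasetToRddSketch are hypothetical stand-ins for illustration, not part of the original code.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SparkSession}

// Hypothetical object name; the data below is made up for illustration.
object DatasetToRddSketch {
  def main(args: Array[String]): Unit = {
    // Local session, assumed only so the sketch can run on its own.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("dataset-to-rdd-sketch")
      .getOrCreate()
    import spark.implicits._ // encoders needed by Dataset.map and toDF

    // Stand-in for testResults: a label and a two-class probability vector.
    val testResults = Seq(
      (1.0, Vectors.dense(0.2, 0.8)),
      (0.0, Vectors.dense(0.7, 0.3)),
      (1.0, Vectors.dense(0.4, 0.6)),
      (0.0, Vectors.dense(0.9, 0.1))
    ).toDF("Label", "ModelProbability")

    // Option 1: map on the DataFrame (this yields a Dataset), then call .rdd.
    val viaDataset: RDD[(Double, Double)] = testResults
      .select("Label", "ModelProbability")
      .map { case Row(l: Double, p: Vector) => (p(1), l) }
      .rdd

    // Option 2: drop to RDD[Row] first, then map with the plain RDD API.
    val viaRdd: RDD[(Double, Double)] = testResults
      .select("Label", "ModelProbability")
      .rdd
      .map { case Row(l: Double, p: Vector) => (p(1), l) }

    // Both routes should produce the same (score, label) pairs.
    require(viaDataset.collect().toSet == viaRdd.collect().toSet)

    // BinaryClassificationMetrics now receives the RDD it expects.
    val testMetrics = new BinaryClassificationMetrics(viaDataset)
    println(s"Area under ROC: ${testMetrics.areaUnderROC()}")

    spark.stop()
  }
}
```

Option 2 avoids the need for an implicit Encoder for the mapped type, since the mapping happens on the RDD side. Which one reads better is largely a matter of taste.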