found: org.apache.spark.sql.Dataset[(Double, Double)] required: org.apache.spark.rdd.RDD[(Double, Double)]

I get the following compile error:

found : org.apache.spark.sql.Dataset[(Double, Double)] 
required: org.apache.spark.rdd.RDD[(Double, Double)] 
    val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel)

It comes from the code below:

val testScoreAndLabel = testResults. 
    select("Label","ModelProbability"). 
    map{ case Row(l:Double,p:Vector) => (p(1),l) } 
val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel)

From the error, it seems that testScoreAndLabel is of type sql.Dataset, but BinaryClassificationMetrics expects an RDD.

How can I convert a sql.Dataset into an RDD?


2016-11-13 Anthony

I would do something like this:

val testScoreAndLabel = testResults. 
    select("Label","ModelProbability"). 
    map{ case Row(l:Double,p:Vector) => (p(1),l) }

Now just call testScoreAndLabel.rdd when constructing the metrics:

val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel.rdd)

API Doc

This converts testScoreAndLabel from a Dataset to an RDD.
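For anyone who wants to try the conversion end to end, here is a minimal self-contained sketch. It is not the asker's actual pipeline: the local SparkSession, the object name DatasetToRddExample, and the four hand-written rows standing in for testResults are all assumptions made for illustration. It also uses a small variation on the snippet above, calling .rdd before map, so the mapping runs on an RDD[Row] and yields an RDD[(Double, Double)] directly.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SparkSession}

object DatasetToRddExample {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session, only for trying this out.
    val spark = SparkSession.builder()
      .appName("DatasetToRddExample")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for testResults: a label column and a
    // two-element probability vector, named as in the question.
    val testResults = Seq(
      (1.0, Vectors.dense(0.2, 0.8)),
      (0.0, Vectors.dense(0.7, 0.3)),
      (1.0, Vectors.dense(0.4, 0.6)),
      (0.0, Vectors.dense(0.9, 0.1))
    ).toDF("Label", "ModelProbability")

    // .rdd turns the DataFrame into an RDD[Row]; the map then builds
    // (score, label) pairs, where the score is the probability of class 1.
    val testScoreAndLabel: RDD[(Double, Double)] = testResults
      .select("Label", "ModelProbability")
      .rdd
      .map { case Row(l: Double, p: Vector) => (p(1), l) }

    // BinaryClassificationMetrics now receives the RDD it requires.
    val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel)
    println(s"Area under ROC: ${testMetrics.areaUnderROC()}")

    spark.stop()
  }
}
```

Either order should satisfy the compiler here: mapping the Dataset and then calling .rdd, as in the answer, or calling .rdd first as in this sketch. What matters is that the value passed to BinaryClassificationMetrics is an RDD[(Double, Double)].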


2016-11-13 19:36:13 mrsrinivas
