Spark 2.0: how to convert an RDD of tuples to a DataFrame

I upgraded one of my projects from Spark 1.6 to Spark 2.0.1. The following code works on Spark 1.6 but does not work on 2.0.1:
def count(df: DataFrame): DataFrame = {
  val sqlContext = df.sqlContext
  import sqlContext.implicits._
  df.map { case Row(userId: String, itemId: String, count: Double) =>
    (userId, itemId, count)
  }.toDF("userId", "itemId", "count")
}
Here is the error message:
Error:(53, 12) Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases.
df.map { case Row(userId: String, itemId: String, count: Double) =>
^
Error:(53, 12) not enough arguments for method map: (implicit evidence$7: org.apache.spark.sql.Encoder[(String, String, Double)])org.apache.spark.sql.Dataset[(String, String, Double)].
Unspecified value parameter evidence$7.
df.map { case Row(userId: String, itemId: String, count: Double) =>
^
I tried using df.rdd.map instead of df.map, and then got the following error:
Error:(55, 7) value toDF is not a member of org.apache.spark.rdd.RDD[(String, String, Double)]
possible cause: maybe a semicolon is missing before `value toDF'?
}.toDF("userId", "itemId", "count")
^
How do I convert an RDD of tuples to a DataFrame in Spark 2.0?
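For reference, a minimal sketch of one way to sidestep the missing implicit entirely: pass the tuple Encoder to map explicitly via Encoders.tuple, so no implicits import is required at all. The function and column names here simply mirror the code above.

import org.apache.spark.sql.{DataFrame, Encoders, Row}

def count(df: DataFrame): DataFrame = {
  // Supply the Encoder for (String, String, Double) explicitly;
  // Dataset.map takes the encoder as a second (implicit) parameter list
  val enc = Encoders.tuple(Encoders.STRING, Encoders.STRING, Encoders.scalaDouble)
  df.map { case Row(userId: String, itemId: String, count: Double) =>
    (userId, itemId, count)
  }(enc).toDF("userId", "itemId", "count")
}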
Have you tried 'import spark.implicits._'? –
@rogue-one Yes, I tried changing 'val sqlContext = df.sqlContext; import sqlContext.implicits._' to 'val spark = df.sparkSession; import spark.implicits._', but got the same error. – Rainfield
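For completeness, a minimal sketch of the RDD route the question attempts, written out with the stable-val import discussed in the comments; in Spark 2.0 it is spark.implicits._ that puts toDF on RDD[(String, String, Double)], and the import must come from a val rather than an expression:

import org.apache.spark.sql.{DataFrame, Row}

def count(df: DataFrame): DataFrame = {
  // Import from a stable identifier: Scala imports cannot follow an arbitrary expression
  val spark = df.sparkSession
  import spark.implicits._
  df.rdd.map { case Row(userId: String, itemId: String, count: Double) =>
    (userId, itemId, count)
  }.toDF("userId", "itemId", "count")
}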