How to convert a Spark DataFrame to a nested DataFrame
I have a DataFrame with 6 columns like this:
df.printSchema
root
|-- d1: string (nullable = true)
|-- d2: string (nullable = true)
|-- d3: string (nullable = true)
|-- m1: string (nullable = true)
|-- m2: string (nullable = true)
|-- m3: string (nullable = true)
For various reasons, I want to convert it into this:
root
|-- d1: string (nullable = true)
|-- d2: string (nullable = true)
|-- d3: string (nullable = true)
|-- metric: nested
|    |-- m1: string (nullable = true)
|    |-- m2: string (nullable = true)
|    |-- m3: string (nullable = true)
I have spent several hours on this but cannot figure it out. Here is what I have tried so far in the spark-shell:
case class Metric(m1: String, m2: String, m3: String)
case class Dimension(d1: String, d2: String, d3: String, metric: Metric)
scala> df.map(row => Dimension(row.getAs[String]("d1"),
| row.getAs[String]("d2"),
| row.getAs[String]("d3"),
| Metric(row.getAs[String]("m1"),
| row.getAs[String]("m2"),
| row.getAs[String]("m3"))))
res48: org.apache.spark.rdd.RDD[Dimension] = MapPartitionsRDD[32] at map at <console>:46
scala> df.map(row => Dimension(row.getAs[String]("d1"),
| row.getAs[String]("d2"),
| row.getAs[String]("d3"),
| Metric(row.getAs[String]("m1"),
| row.getAs[String]("m2"),
| row.getAs[String]("m3")))).collect().foreach(println)
WARN scheduler.TaskSetManager: Lost task 0.0 in stage 2.0 (TID 220, hostname): java.lang.ClassNotFoundException: $line55.$read$$iwC$$iwC$Dimension
scala> df.map(row => Dimension(row.getAs[String]("d1"),
| row.getAs[String]("d2"),
| row.getAs[String]("d3"),
| Metric(row.getAs[String]("m1"),
| row.getAs[String]("m2"),
| row.getAs[String]("m3")))).toDF
res50: org.apache.spark.sql.DataFrame = [d1: string, d2: string, d3: string, metric: struct<m1:string,m2:string,m3:string>]
scala> df.map(row => Dimension(row.getAs[String]("d1"),
| row.getAs[String]("d2"),
| row.getAs[String]("d3"),
| Metric(row.getAs[String]("m1"),
| row.getAs[String]("m2"),
| row.getAs[String]("m3")))).toDF.select("d1").show()
ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerSQLExecutionStart(1,show at <console>:51,org.apache.spark.sql.DataFrame.show(DataFrame.scala:319)
Please help me. Thanks.
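For comparison, the target schema can probably also be reached without case classes, by grouping the three metric columns with the `struct` function from `org.apache.spark.sql.functions`. This is only a minimal sketch: it assumes Spark 1.4 or later, that it is pasted into the same spark-shell session where `df` is the six-column DataFrame above, and `nested` is just an illustrative name. Because it stays in the DataFrame API and does not send REPL-defined classes such as `Dimension` to the executors, it may sidestep the `ClassNotFoundException` shown above, though I have not confirmed that on this cluster.

```scala
import org.apache.spark.sql.functions.{col, struct}

// Assumption: `df` is the DataFrame from the question, with string
// columns d1, d2, d3, m1, m2, m3.
val nested = df.select(
  col("d1"),
  col("d2"),
  col("d3"),
  // Pack the three metric columns into a single struct column
  // and name it "metric".
  struct(col("m1"), col("m2"), col("m3")).alias("metric")
)

// Print the resulting schema to check that m1..m3 now sit under "metric".
nested.printSchema()

// Nested fields can then be addressed with dot notation.
nested.select("d1", "metric.m1").show()
```

The resulting column is a Spark `struct` type, which matches the `metric: struct<m1:string,m2:string,m3:string>` reported by the `toDF` attempt above.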