val client = Seq((1,"A",10),(2,"A",5),(3,"B",56)).toDF("ID","Categ","Amnt")
+---+-----+----+
| ID|Categ|Amnt|
+---+-----+----+
|  1|    A|  10|
|  2|    A|   5|
|  3|    B|  56|
+---+-----+----+
I want to get the number of IDs and the total amount for each category:
+-----+-----+---------+
|Categ|count|sum(Amnt)|
+-----+-----+---------+
|    B|    1|       56|
|    A|    2|       15|
+-----+-----+---------+
Is it possible to compute the count and the sum without having to do a join, as I do here?
client.groupBy("Categ").count
.join(client.withColumnRenamed("Categ","cat")
.groupBy("cat")
.sum("Amnt"), 'Categ === 'cat)
.drop("cat")
Maybe something like this:
client.createOrReplaceTempView("client")
spark.sql("SELECT Categ, count(Categ), sum(Amnt) FROM client GROUP BY Categ").show()
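For comparison, here is a minimal sketch of how the same result could probably be obtained with the DataFrame API alone, by passing both aggregates to a single `agg` call after `groupBy`. The session setup, the `result` name and the column aliases are assumptions added for illustration; in spark-shell the `spark` session already exists and `getOrCreate()` simply returns it.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{count, sum}

// Assumption: a local session for a standalone run; spark-shell already provides `spark`.
val spark = SparkSession.builder().master("local[*]").appName("agg-sketch").getOrCreate()
import spark.implicits._

val client = Seq((1, "A", 10), (2, "A", 5), (3, "B", 56)).toDF("ID", "Categ", "Amnt")

// One groupBy with two aggregates in the same agg call, so no join is needed.
// The aliases only reproduce the column names of the desired output.
val result = client.groupBy("Categ").agg(count("ID").as("count"), sum("Amnt").as("sum(Amnt)"))

result.show()
```

Row order in the output of `show()` is not guaranteed, so `B` may appear before or after `A`.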