SPARK SQL GROUPING SETS

I need to pass various combinations of column sets to my Spark SQL query as parameters.

For example:

val result = sqlContext.sql("""
  select col1, col2, col3, col4, col5, count(col6)
  from table T1
  GROUP BY col1, col2, col3, col4, col5
  GROUPING SETS ((col1, col2), (col3, col4), (col4, col5))
""")

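There are several combinations for which I need aggregated values. Is there a way to pass these sets of columns to the SQL query as a parameter instead of hard-coding them by hand?

At the moment every combination is written out in the SQL query, so whenever a new combination comes up I have to change the query. I plan to put all the combinations in a file, read them, and pass them to the SQL query as a parameter. Is that possible?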

Example table:

id  category  age  gender  cust_id
1   101       54   M       1111
1   101       54   M       2222
1   101       55   M       3333
1   102       55   F       4444

""" select id, category, age, gender, count(cust_id) from table T1 GROUP BY id, category, age, gender
GROUPING SETS ((id, category), (id, age), (id, gender)) """

It should produce the following result:

group by (id, category) - count of cust_id
1 101 3
1 102 1
group by (id, age) - count of cust_id
1 54 2
1 55 2
group by (id, gender) - count of cust_id
1 M 3
1 F 1
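
As a sanity check on the expected counts above, the same three groupings can be reproduced with plain Scala collections, no Spark required. This is only a sketch: the `Record` case class and the `countBy` helper are hypothetical names introduced for illustration and are not part of the question.

```scala
// Sample rows from the question: (id, category, age, gender, cust_id).
// `Record` is a hypothetical name used only for this illustration.
case class Record(id: Int, category: Int, age: Int, gender: String, custId: Int)

val rows = List(
  Record(1, 101, 54, "M", 1111),
  Record(1, 101, 54, "M", 2222),
  Record(1, 101, 55, "M", 3333),
  Record(1, 102, 55, "F", 4444)
)

// Counts rows per key, mirroring count(cust_id) for one grouping set
// (every cust_id in the sample is non-null, so row count == count(cust_id)).
def countBy[K](key: Record => K): Map[K, Int] =
  rows.groupBy(key).map { case (k, rs) => k -> rs.size }

val byCategory = countBy(r => (r.id, r.category)) // grouping set (id, category)
val byAge      = countBy(r => (r.id, r.age))      // grouping set (id, age)
val byGender   = countBy(r => (r.id, r.gender))   // grouping set (id, gender)

println(byCategory)
println(byAge)
println(byGender)
```

Each map corresponds to one of the three grouping sets; in Spark the three result sets come back as a single table, with NULL in the columns that a given grouping set does not use.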

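This is just an example. I need to pass various combinations (not all possible combinations) to GROUPING SETS as parameters, either all in one go or separately.

Any help would be greatly appreciated.

Many thanks.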

Source

2017-10-18 yuvraj rajpurohit

Could you share a small example of the data and the expected output? – mtoto

Hi, I have added the example. Any help would be much appreciated. –

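"combination sets to my SQL query as a parameter"

The SQL is executed by Spark, not by the source database. It never reaches MySQL at all.

"I have provided all the combinations"

If you want all possible combinations, you don't need GROUPING SETS. Just use CUBE: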

SELECT ... FROM table GROUP BY CUBE (col1, col2, col3, col4, col5)

Source

2017-10-18 12:42:11 user8795636

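By "my SQL" I meant my Spark SQL query. I need to aggregate over various column combinations, not over all of them. I tried CUBE earlier to get every combination, but it took a very long time and the job failed; it also generates all combinations, which I don't need in my case. –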
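
To put that cost in perspective: CUBE over n columns is equivalent to grouping by every subset of those columns, so five columns mean 2^5 = 32 grouping sets, against the three the question actually asks for. A rough sketch that enumerates them in plain Scala (the names `cols`, `cubeSets` and `wanted` are illustrative only):

```scala
val cols = List("col1", "col2", "col3", "col4", "col5")

// Every subset of the columns, from the empty set (the grand total)
// up to all five columns: this is what CUBE expands to.
val cubeSets: List[List[String]] =
  (0 to cols.size).flatMap(k => cols.combinations(k)).toList

// The grouping sets from the question's first query.
val wanted = List(
  List("col1", "col2"),
  List("col3", "col4"),
  List("col4", "col5")
)

println(s"CUBE grouping sets: ${cubeSets.size}, actually needed: ${wanted.size}")
```

Listing only the required sets in GROUPING SETS avoids computing the other aggregates.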

You can build the SQL dynamically:

// original slices 
var slices = List("(col1, col2)", "(col3, col4)", "(col4, col5)") 
// adding new slice 
slices = "(col1, col5)" :: slices 
// building SQL dynamically 
val q = 
s""" 
with t1 as 
(select 1 col1, 2 col2, 3 col3, 
     4 col4, 5 col5, 6 col6) 
select col1,col2,col3,col4,col5,count(col6) 
    from t1 
group by col1,col2,col3,col4,col5 
grouping sets ${slices.mkString("(", ",", ")")} 
""" 
// output 
spark.sql(q).show

Result:

scala> spark.sql(q).show
+----+----+----+----+----+-----------+
|col1|col2|col3|col4|col5|count(col6)|
+----+----+----+----+----+-----------+
|   1|null|null|null|   5|          1|
|   1|   2|null|null|null|          1|
|null|null|   3|   4|null|          1|
|null|null|null|   4|   5|          1|
+----+----+----+----+----+-----------+
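
The question also asks about keeping the combinations in a file. One possible way to extend the approach above is sketched below. It assumes a hypothetical file format (one grouping set per line, column names separated by commas, `#` for comment lines); the helper names `parseGroupingSets`, `loadGroupingSets` and `buildQuery` are made up for this illustration and are not part of Spark. Because the query is assembled by string interpolation, the sketch only accepts plain identifiers.

```scala
import scala.io.Source

// Assumed (hypothetical) file format: one grouping set per line, e.g.
//   id,category
//   id,age
// Blank lines and lines starting with '#' are ignored.
def parseGroupingSets(lines: Iterator[String]): List[List[String]] =
  lines
    .map(_.trim)
    .filter(l => l.nonEmpty && !l.startsWith("#"))
    .map(_.split(",").map(_.trim).filter(_.nonEmpty).toList)
    .filter(_.nonEmpty)
    .toList

// Reads the grouping sets from a file and closes it afterwards.
def loadGroupingSets(path: String): List[List[String]] = {
  val src = Source.fromFile(path)
  try parseGroupingSets(src.getLines()) finally src.close()
}

// Builds the SQL text. GROUP BY lists every column that appears in any set.
def buildQuery(table: String, countCol: String, sets: List[List[String]]): String = {
  require(sets.nonEmpty, "at least one grouping set is required")
  val allCols = sets.flatten.distinct
  // Accept only simple identifiers, since the values are spliced into SQL text.
  val ident = "[A-Za-z_][A-Za-z0-9_]*"
  require((table :: countCol :: allCols).forall(_.matches(ident)), "invalid identifier")
  val setsSql = sets.map(_.mkString("(", ", ", ")")).mkString("(", ", ", ")")
  s"""SELECT ${allCols.mkString(", ")}, count($countCol)
     |FROM $table
     |GROUP BY ${allCols.mkString(", ")}
     |GROUPING SETS $setsSql""".stripMargin
}

// In-memory stand-in for the file contents, matching the question's example.
val sets = parseGroupingSets(Iterator("# one set per line", "id,category", "id, age", "", "id,gender"))
val query = buildQuery("T1", "cust_id", sets)
println(query)
```

With a real file, `spark.sql(buildQuery("T1", "cust_id", loadGroupingSets(path)))` would run the generated query, where `path` points to the combinations file; adding a new combination then only means adding a line to that file.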

Source

2017-10-18 13:35:02
