2013-08-22

Hive: count distinct values in a comma-separated column

I have a Hive table that looks like this:

ID  listOfcategories
1   ["a","b","b","a","c","d","d"]
2   ["a","a","a","c","c","c","c","e","e","e"]
3   ["a","b","c"]

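The number of comma-separated values varies from row to row. I want to query the number of distinct categories in each row/ID, so my output should look like this: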

ID  numDistCategories 
1  4 
2  3 
3  3 

Answer

You can use explode to output a separate row for each category, and then COUNT(DISTINCT ...) to get the result you are looking for.

Like this:

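-- Assumes listOfcategories is an ARRAY<STRING> column.
-- EXPLODE is a table-generating function, so Hive does not allow it next to
-- other select expressions such as id; LATERAL VIEW pairs each element with its row.
SELECT
    id,
    COUNT(DISTINCT cat) AS numDistCategories
FROM myTable
LATERAL VIEW EXPLODE(listOfcategories) exploded AS cat
GROUP BY id;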
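
The question calls the column comma-separated, so it may be stored as a plain STRING (for example a,b,b,a,c,d,d) rather than an ARRAY<STRING>. In that case, a rough sketch is to SPLIT the string into an array before exploding it. The STRING layout and the "," delimiter are assumptions here, not something the question confirms; myTable and the column names are taken from the query above.

```sql
-- Sketch only: assumes listOfcategories is a STRING like 'a,b,b,a,c,d,d'
-- with no brackets, quotes, or spaces around the commas.
SELECT
    id,
    COUNT(DISTINCT cat) AS numDistCategories
FROM myTable
-- SPLIT turns the string into ARRAY<STRING>; EXPLODE then emits one row per element.
LATERAL VIEW EXPLODE(SPLIT(listOfcategories, ',')) exploded AS cat
GROUP BY id;
```

If the stored string really contains the brackets and quotes shown in the question, those would have to be stripped first (for example with REGEXP_REPLACE) before splitting.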

Hope that helps.