2013-10-14 34 views
2

Google BigQuery SQL: separate columns

I'm after an efficient(ish) BigQuery SQL query to solve the following problem:

I have a table that looks like this:

 

    
    Row | Col_A | Col_B | 
    --------------------- 
    1 | 2 | 3 | 
    2 | 1 | 4 | 
    3 | 5 | 7 | 
    4 | 2 | 3 | 
    5 | 6 | 1 | 

    ...and so on (>million rows) 
 

The values in each column are IDs in the range [1..7].

The query should return the total for each code in each column, as follows, without using multiple SELECT queries:

 

    
    Code | Total Col_A | Total Col_B 
    -------------------------------- 
     1 |  1  |  1 
     2 |  2  |  0 
     3 |  0  |  2 
     4 |  0  |  1 
     5 |  1  |  0 
     6 |  1  |  0 
     7 |  0  |  1 
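
As a way to cross-check any query against the five sample rows, the per-code totals can be computed outside BigQuery. This is a minimal Python sketch, not part of the original question; the names `rows`, `count_a`, `count_b` and `totals` are made up for illustration.

```python
from collections import Counter

# The five sample rows from the table above, as (Col_A, Col_B) pairs.
rows = [(2, 3), (1, 4), (5, 7), (2, 3), (6, 1)]

# Count how often each ID occurs in each column independently.
count_a = Counter(a for a, _ in rows)
count_b = Counter(b for _, b in rows)

# One output row per code in [1..7]; a Counter yields 0 for unseen codes.
totals = [(code, count_a[code], count_b[code]) for code in range(1, 8)]

print("Code | Total Col_A | Total Col_B")
for code, total_a, total_b in totals:
    print(f"{code:>4} | {total_a:>11} | {total_b:>11}")
```

Each column is counted on its own, which is the behaviour the desired single query has to reproduce.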
 

Does anyone know of a way to do this in BigQuery?

Cheers.

+1

Please show us what you have tried so far. – Szymon

Answers

2

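Could you create a public dataset with your sample data? It would be easier to write a query that works on the data and to verify the results.

A starting query: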

SELECT Code, COUNT(Col_A) count_column_x, COUNT(Col_B) count_column_y 
FROM [your:list.of_codes] a 
LEFT JOIN EACH [your:sample.table] b 
ON a.Code=b.Col_A 
GROUP BY 1 

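(It's not perfect: the join matches codes against Col_A only, so the second count is not a true Col_B total. We would get further if you shared a table to work with.)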
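
The effect of joining on Col_A alone can be seen by emulating the starting query in plain Python. This is only a sketch under assumptions: the codes table is assumed to hold the IDs 1 to 7, the sample rows come from the question, and `starter_query_counts` is a hypothetical helper name.

```python
# Sample rows from the question, as (Col_A, Col_B) pairs.
rows = [(2, 3), (1, 4), (5, 7), (2, 3), (6, 1)]
codes = range(1, 8)  # assumed contents of the codes table


def starter_query_counts(codes, rows):
    """Emulate: codes LEFT JOIN rows ON Code = Col_A, GROUP BY Code."""
    result = []
    for code in codes:
        # Rows surviving the join condition for this code.
        matched = [(a, b) for a, b in rows if a == code]
        # COUNT(col) counts non-NULL values among the joined rows.
        count_a = sum(1 for a, _ in matched if a is not None)
        count_b = sum(1 for _, b in matched if b is not None)
        result.append((code, count_a, count_b))
    return result


starter = starter_query_counts(codes, rows)
for code, count_a, count_b in starter:
    print(code, count_a, count_b)
```

With no NULLs in the data, both counts come out identical for every code, because Col_B is only ever counted on rows that were selected by their Col_A value.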

1

Does anyone know of a way to do this in BigQuery without using multiple SELECTs?

One option is to use standard SQL:

#standardSQL 
WITH logs AS (
    SELECT 2 AS Col_A, 3 AS Col_B UNION ALL 
    SELECT 1 AS Col_A, 4 AS Col_B UNION ALL 
    SELECT 5 AS Col_A, 7 AS Col_B UNION ALL 
    SELECT 2 AS Col_A, 3 AS Col_B UNION ALL 
    SELECT 6 AS Col_A, 1 AS Col_B 
) 
SELECT 
    id, 
    SUM(CAST(id = Col_A AS INT64)) AS Total_Col_A, 
    SUM(CAST(id = Col_B AS INT64)) AS Total_Col_B 
FROM logs, UNNEST(GENERATE_ARRAY(1,7)) AS id 
GROUP BY id 
ORDER BY id 
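
To make the mechanics of the query above easier to follow, here is a rough Python emulation of the same idea: pair every row with every ID from 1 to 7 (the comma join with `UNNEST(GENERATE_ARRAY(1,7))`), then add 1 whenever the ID equals the column value. It is an illustration only, not how BigQuery executes the query, and the variable names are invented.

```python
from itertools import product

# Same sample rows as the `logs` CTE, as (Col_A, Col_B) pairs.
rows = [(2, 3), (1, 4), (5, 7), (2, 3), (6, 1)]
ids = range(1, 8)  # stands in for GENERATE_ARRAY(1, 7)

# totals[id] = [Total_Col_A, Total_Col_B]
totals = {i: [0, 0] for i in ids}

# The cross join yields len(rows) * 7 (row, id) pairs.
for (col_a, col_b), i in product(rows, ids):
    totals[i][0] += int(i == col_a)  # SUM(CAST(id = Col_A AS INT64))
    totals[i][1] += int(i == col_b)  # SUM(CAST(id = Col_B AS INT64))

for i in ids:  # ORDER BY id
    print(i, totals[i][0], totals[i][1])
```

Logically the cross join multiplies the row count by the number of IDs, which is a small constant factor of seven here.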

Or with COUNTIF():

#standardSQL 
WITH logs AS (
    SELECT 2 AS Col_A, 3 AS Col_B UNION ALL 
    SELECT 1 AS Col_A, 4 AS Col_B UNION ALL 
    SELECT 5 AS Col_A, 7 AS Col_B UNION ALL 
    SELECT 2 AS Col_A, 3 AS Col_B UNION ALL 
    SELECT 6 AS Col_A, 1 AS Col_B 
) 
SELECT 
    id, 
    COUNTIF(id = Col_A) AS Total_Col_A, 
    COUNTIF(id = Col_B) AS Total_Col_B 
FROM logs, UNNEST(GENERATE_ARRAY(1,7)) AS id 
GROUP BY id 
ORDER BY id
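
For readers without a BigQuery project at hand, the same conditional-aggregation pattern can be tried locally. The sketch below is a stand-in under stated assumptions: it uses Python's built-in `sqlite3` module and SQLite's dialect rather than BigQuery, so `GENERATE_ARRAY` and `UNNEST` are replaced by a recursive CTE, and `COUNTIF` by `SUM` over a `CASE` expression. The table name `logs` mirrors the CTE above; `ids` is an invented name.

```python
import sqlite3

# In-memory database holding the five sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (Col_A INTEGER, Col_B INTEGER)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?)",
    [(2, 3), (1, 4), (5, 7), (2, 3), (6, 1)],
)

query = """
WITH RECURSIVE ids(id) AS (
    -- SQLite substitute for GENERATE_ARRAY(1, 7)
    SELECT 1
    UNION ALL
    SELECT id + 1 FROM ids WHERE id < 7
)
SELECT
    id,
    SUM(CASE WHEN id = Col_A THEN 1 ELSE 0 END) AS Total_Col_A,
    SUM(CASE WHEN id = Col_B THEN 1 ELSE 0 END) AS Total_Col_B
FROM logs CROSS JOIN ids
GROUP BY id
ORDER BY id
"""

sqlite_totals = conn.execute(query).fetchall()
for row in sqlite_totals:
    print(row)
conn.close()
```

This only checks the logic of the pattern on the sample rows; it says nothing about BigQuery performance on a table with more than a million rows.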