2016-03-04 25 views
1

BigQuery: count instances within a time period, with GROUP BY

I have a table with the following columns:

ConsumerID, TransactionDate, Revenue, OrderID

ConsumerID and OrderID are integers; TransactionDate is a timestamp.

The data in the orders table uploaded to BigQuery is structured as follows:

ConsumerId || TransactionDate   || Revenue || OrderID 
1   || 2014-10-27 00:00:00 UTC || 55  || 653745 
1   || 2015-02-27 00:00:00 UTC || 65  || 767833 
1   || 2015-12-27 00:00:00 UTC || 456  || 5676324 
2   || 2014-10-27 00:00:00 UTC || 56  || 435261 
2   || 2016-02-27 00:00:00 UTC || 43  || 5632436724 

So my expected output would be:

ConsumerId || Count Of Orders In Last 12 months 
    1  || 2 
    2  || 1 

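I want to count the number of orders each customer placed within the first 12 months, counted from their first order.

In BigQuery I wrote the following, which fails with the error "Cannot group by an aggregate" (L3:157):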

SELECT 
    ConsumerId, 
    COUNT(OrderNumber BETWEEN MIN(TransactionDate)AND DATE_ADD(MIN(TransactionDate),11,"MONTH")) AS CountOfOrdersTwelve, 
FROM 
    [ordertable.orders] 
GROUP BY 
    1, 
    2 
ORDER BY 
    ConsumerId ; 

However, this query fails with an error.

Does anyone know how this can be done in BigQuery?

Answers

2
Something for you to consider (assuming input like the sample below):

 (SELECT 1 AS ConsumerID, '2014-01-01' AS TransactionDate, 1 AS OrderID), 
     (SELECT 1 AS ConsumerID, '2014-05-01' AS TransactionDate, 2 AS OrderID), 
     (SELECT 1 AS ConsumerID, '2015-01-01' AS TransactionDate, 3 AS OrderID), 
     (SELECT 1 AS ConsumerID, '2015-03-01' AS TransactionDate, 4 AS OrderID), 
     (SELECT 1 AS ConsumerID, '2015-04-01' AS TransactionDate, 5 AS OrderID), 
     (SELECT 1 AS ConsumerID, '2015-05-01' AS TransactionDate, 6 AS OrderID), 

     (SELECT 2 AS ConsumerID, '2015-01-01' AS TransactionDate, 1 AS OrderID), 
     (SELECT 2 AS ConsumerID, '2015-01-01' AS TransactionDate, 2 AS OrderID), 
     (SELECT 2 AS ConsumerID, '2015-01-01' AS TransactionDate, 3 AS OrderID), 
     (SELECT 2 AS ConsumerID, '2015-03-01' AS TransactionDate, 4 AS OrderID), 
     (SELECT 2 AS ConsumerID, '2015-04-01' AS TransactionDate, 5 AS OrderID), 
     (SELECT 2 AS ConsumerID, '2016-05-01' AS TransactionDate, 6 AS OrderID), 

     (SELECT 3 AS ConsumerID, '2015-04-01' AS TransactionDate, 1 AS OrderID), 
     (SELECT 3 AS ConsumerID, '2015-05-01' AS TransactionDate, 2 AS OrderID) 

Your data may use different data types, so you will need to adjust the query accordingly.

SELECT ConsumerID, MAX(CountOfOrders) AS CountOfOrdersTwelve 
FROM (
    SELECT ConsumerID, CountOfOrders 
    FROM (
    SELECT 
     ConsumerID, TransactionDate, 
     COUNT(1) OVER(PARTITION BY ConsumerID ORDER BY TransactionDate) AS CountOfOrders, 
     FIRST_VALUE(TransactionDate) 
     OVER(PARTITION BY ConsumerID ORDER BY TransactionDate) AS firstTransactionDate 
    FROM [ordertable.orders] 
) HAVING DATEDIFF(TransactionDate, firstTransactionDate) <= 365 
) GROUP BY ConsumerID ORDER BY ConsumerID 
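To sanity-check what the query above should return, here is a minimal Python sketch of the same logic: find each consumer's first TransactionDate, then count the orders that fall within 365 days of it. It is a cross-check under stated assumptions, not part of the original answer. The names `sample_orders` and `orders_within_first_year` are hypothetical, and the rows are copied from the sample input above. Like the query, it treats "12 months" as 365 days, so a window that spans a leap day is one day short of a calendar year.

```python
from collections import defaultdict
from datetime import date

# (ConsumerID, TransactionDate, OrderID), copied from the sample input above.
sample_orders = [
    (1, "2014-01-01", 1), (1, "2014-05-01", 2), (1, "2015-01-01", 3),
    (1, "2015-03-01", 4), (1, "2015-04-01", 5), (1, "2015-05-01", 6),
    (2, "2015-01-01", 1), (2, "2015-01-01", 2), (2, "2015-01-01", 3),
    (2, "2015-03-01", 4), (2, "2015-04-01", 5), (2, "2016-05-01", 6),
    (3, "2015-04-01", 1), (3, "2015-05-01", 2),
]


def orders_within_first_year(orders, window_days=365):
    """Count, per consumer, the orders placed within window_days of the first order."""
    dates_by_consumer = defaultdict(list)
    for consumer_id, transaction_date, _order_id in orders:
        dates_by_consumer[consumer_id].append(date.fromisoformat(transaction_date))

    counts = {}
    for consumer_id, dates in sorted(dates_by_consumer.items()):
        first = min(dates)  # plays the role of firstTransactionDate in the query
        # Same inclusive bound as DATEDIFF(...) <= 365 in the query.
        counts[consumer_id] = sum(1 for d in dates if (d - first).days <= window_days)
    return counts


print(orders_within_first_year(sample_orders))  # {1: 3, 2: 5, 3: 2}
```

Consumer 1's order on 2015-01-01 is exactly 365 days after the first order, so the inclusive bound counts it.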

Compact version

This version works with both the STRING (as in the sample input for the first solution above) and TIMESTAMP (as in your updated question) data types for TransactionDate.

SELECT 
    ConsumerID, CountOfOrdersTwelve 
FROM (
    SELECT 
    ConsumerID, 
    TIMESTAMP_TO_SEC(TIMESTAMP(TransactionDate)) AS ts, 
    COUNT(ts) OVER (PARTITION BY ConsumerID ORDER BY ts 
     RANGE BETWEEN CURRENT ROW AND 365*24*3600 FOLLOWING) AS CountOfOrdersTwelve, 
    ROW_NUMBER() OVER(PARTITION BY ConsumerID ORDER BY ts) AS pos 
    FROM [ordertable.orders] 
) 
WHERE pos = 1 
ORDER BY ConsumerID 
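The idea behind the compact query, converting every TransactionDate to epoch seconds and counting rows inside a window of 365*24*3600 seconds that starts at the first order, can be sketched in Python as follows. This is only an illustration under assumptions: `to_epoch_seconds`, `count_in_first_window`, and `question_rows` are hypothetical names, the two accepted string formats are the ones that appear in this thread, and the rows are the ones posted in the question.

```python
from calendar import timegm
from datetime import datetime

WINDOW_SECONDS = 365 * 24 * 3600  # same width as the RANGE frame in the query


def to_epoch_seconds(value):
    """Accept a datetime or a date/timestamp string and return UTC epoch seconds."""
    if isinstance(value, datetime):
        return timegm(value.utctimetuple())
    for fmt in ("%Y-%m-%d %H:%M:%S UTC", "%Y-%m-%d"):
        try:
            return timegm(datetime.strptime(value, fmt).timetuple())
        except ValueError:
            continue
    raise ValueError("unrecognised date format: %r" % (value,))


def count_in_first_window(rows):
    """rows: iterable of (ConsumerID, TransactionDate) pairs."""
    stamps_by_consumer = {}
    for consumer_id, transaction_date in rows:
        stamps_by_consumer.setdefault(consumer_id, []).append(
            to_epoch_seconds(transaction_date)
        )

    result = {}
    for consumer_id, stamps in stamps_by_consumer.items():
        first = min(stamps)  # the row the query keeps with pos = 1
        result[consumer_id] = sum(
            1 for ts in stamps if ts <= first + WINDOW_SECONDS
        )
    return result


# The rows posted in the question.
question_rows = [
    (1, "2014-10-27 00:00:00 UTC"),
    (1, "2015-02-27 00:00:00 UTC"),
    (1, "2015-12-27 00:00:00 UTC"),
    (2, "2014-10-27 00:00:00 UTC"),
    (2, "2016-02-27 00:00:00 UTC"),
]

print(count_in_first_window(question_rows))  # {1: 2, 2: 1}
```

On the question's data this gives 2 orders for consumer 1 and 1 for consumer 2, which matches the expected output stated in the question.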

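Thanks for your reply. The query runs without error, but unfortunately it returns 0 results, which cannot be right because people have obviously placed orders. Does the data type matter? ConsumerId and OrderNumber are integers. I should clarify that the order number is just an auto-incrementing database number and says nothing about how many orders a customer has placed. I really appreciate any help you can give. –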


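The query above does work; the problem is most likely in the data types of your specific data. Post a brief sample of your data and I will adjust the query accordingly. –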


Thanks, I have updated the original question. –