2017-02-03 60 views
customer txn_date tag running_sum 
A   1-Jan-17 1 1 
A   2-Jan-17 1 2 
A   3-Jan-17 1 3 
A   4-Jan-17 1 4 
A   5-Jan-17 1 5 
A   6-Jan-17 1 6 
A   7-Jan-17 0 0 
A   8-Jan-17 1 1 
A   9-Jan-17 1 2 
A   10-Jan-17 1 3 
A   11-Jan-17 0 0 
A   12-Jan-17 0 0 
A   13-Jan-17 1 1 
A   14-Jan-17 1 2 
A   15-Jan-17 0 0 

In Hive, how do I compute a running sum while tag <> 0 and reset it to 0 when tag = 0? The running_sum column in the example above shows the desired result. TIA


Are you trying to write an SQL query? If so, what have you tried so far? See [How to Ask](http://stackoverflow.com/help/how-to-ask) – yeputons

Answer


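What you need to do is build a "group" for each run of 1s and 0s. You can get the groups by creating a boolean flag that marks the reset rows and then taking a cumulative sum over that flag column. From there, you cumulatively sum your original tag column within each group created in the subquery, so the total restarts at every reset row.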

Query

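-- Outer query: running sum of tag, restarted within each (customer, flg_sum) group.
SELECT customer
     , txn_date
     , tag
     , SUM(tag) OVER (PARTITION BY customer, flg_sum
                      ORDER BY txn_date
                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_sum
FROM (
    -- Cumulative count of flagged rows so far; it serves as the group id.
    SELECT *
         , SUM(tag_flg) OVER (PARTITION BY customer
                              ORDER BY txn_date
                              ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS flg_sum
    FROM (
        -- Flag each row where tag is not 1 (the reset rows) with 1, all others with 0.
        SELECT *
             , CASE WHEN tag = 1 THEN 0 ELSE 1 END AS tag_flg
        FROM database.table
    ) x
) y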

Output

customer  txn_date  tag  running_sum 
A    2017-01-01  1  1 
A    2017-01-02  1  2 
A    2017-01-03  1  3 
A    2017-01-04  1  4 
A    2017-01-05  1  5 
A    2017-01-06  1  6 
A    2017-01-07  0  0 
A    2017-01-08  1  1 
A    2017-01-09  1  2 
A    2017-01-10  1  3 
A    2017-01-11  0  0 
A    2017-01-12  0  0 
A    2017-01-13  1  1 
A    2017-01-14  1  2 
A    2017-01-15  0  0 
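
To see why the grouping trick works, the intermediate columns can be traced outside Hive. The following is only a minimal Python sketch of the same three steps for a single customer whose rows are already sorted by txn_date; the function name `running_sum_with_reset` is hypothetical and not part of the original answer.

```python
from itertools import accumulate


def running_sum_with_reset(tags):
    """Mirror the query's steps for one customer, rows already ordered by date.

    Hypothetical helper for illustration only; it is not Hive code.
    """
    # Step 1: tag_flg is 1 on reset rows (tag != 1) and 0 otherwise.
    tag_flg = [0 if t == 1 else 1 for t in tags]

    # Step 2: the cumulative sum of tag_flg acts as a group id that
    # increases by one at every reset row.
    flg_sum = list(accumulate(tag_flg))

    # Step 3: running sum of tag, restarted whenever the group id changes.
    running_sum = []
    prev_group = None
    total = 0
    for t, g in zip(tags, flg_sum):
        if g != prev_group:
            total = 0
            prev_group = g
        total += t
        running_sum.append(total)
    return flg_sum, running_sum


# Tag column from the question's sample data (1-Jan-17 through 15-Jan-17).
sample_tags = [1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0]
flg_sum, running_sum = running_sum_with_reset(sample_tags)
print(flg_sum)
print(running_sum)
```

Each reset row opens a new group and contributes 0 to that group's total, which is why the running sum drops to 0 on exactly those rows and resumes counting from 1 afterwards.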

Thanks @gobrewers14! It works! I had actually been exploring the lag and lead functions, but to no avail. – BoyKislot
