2016-07-06 45 views
0

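SQL partition by time when aggregating

I am working on a fairly complex SQL statement and cannot get the most frequently occurring prop_list while aggregating across users. Here is a sample of my data set: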

user_id, term_id, time_stamp, prop_list 
u100, t10, 7:00, (a,b,c) 
u100, t10, 7:01, (a,b) 
u100, t11, 7:01, (a,b) 
u101, t10, 7:00, (a,b,c) 
u101, t10, 7:01, (a) 
u102, t10, 6:59, (a) 

Desired output:

term_id, term_id_distinct_count, prop_list 
t10, 3, (a,b,c) 
t11, 1, (a,b) 

Here is my current code:

select 
    a.term_id, 
    count(distinct user_id) as term_id_distinct_count, 
    a.prop_list 
from 
    (select 
     user_id, term_id, 
     prop_list, 
     row_number() over(partition by user_id, term_id order by time_stamp asc) as row_no 
    from 
     data_table 
    group) a 
where 
    a.row_no = 1; 

Note that when a user_id has the same term_id more than once, we only want to use the row that occurred first, which is why I order by time_stamp.
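To make the "first occurrence only" rule concrete, here is a minimal sketch of just that step. It assumes SQLite (3.25 or later, for window functions) driven from Python's sqlite3 module, since the question does not name a database, and it stores the sample times zero-padded as text so that they sort correctly; both choices are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question.
# Times are zero-padded ("07:00") so that text ordering matches time ordering.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_table "
    "(user_id TEXT, term_id TEXT, time_stamp TEXT, prop_list TEXT)"
)
conn.executemany(
    "INSERT INTO data_table VALUES (?, ?, ?, ?)",
    [
        ("u100", "t10", "07:00", "(a,b,c)"),
        ("u100", "t10", "07:01", "(a,b)"),
        ("u100", "t11", "07:01", "(a,b)"),
        ("u101", "t10", "07:00", "(a,b,c)"),
        ("u101", "t10", "07:01", "(a)"),
        ("u102", "t10", "06:59", "(a)"),
    ],
)

# Keep only the earliest row for each (user_id, term_id) pair.
first_rows = conn.execute(
    """
    SELECT user_id, term_id, prop_list
    FROM (
        SELECT user_id, term_id, prop_list,
               ROW_NUMBER() OVER (
                   PARTITION BY user_id, term_id
                   ORDER BY time_stamp ASC
               ) AS row_no
        FROM data_table
    ) AS a
    WHERE row_no = 1
    ORDER BY user_id, term_id
    """
).fetchall()

for row in first_rows:
    print(row)
```

For t10 this leaves one row each for u100, u101 and u102, with prop_list values (a,b,c), (a,b,c) and (a). That is three distinct users, and (a,b,c) is the most frequent list, which matches the desired output.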

+1

You have term_id_distinct_count = 3 for T10 ... but from the data it looks like there are only 2 of them ... is this a typo, or am I misunderstanding your question? – objectNotFound

+2

Please tag your question with the database you are using. –

Answers

0

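Some databases that support window functions also support count(distinct) as a window function (Oracle does, for example; PostgreSQL and SQL Server do not), so where it is available you can do: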

select a.term_id, term_id_distinct_count, a.prop_list 
from (select user_id, term_id, prop_list, 
      row_number() over (partition by term_id order by time_stamp asc) as seqnum, 
      count(distinct user_id) over (partition by term_id) as term_id_distinct_count 
     from data_table 
    ) a 
where seqnum = 1;
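Note that the query above numbers rows per term_id only, so it keeps the single earliest row for each term; on the sample data that pairs t10 with u102's (a) from 6:59 rather than (a,b,c). The following is a sketch of one way to get the output requested in the question without count(distinct) as a window function. It assumes SQLite 3.25+ run through Python's sqlite3 module, zero-padded text times, and an alphabetical tie-break when two lists are equally frequent; none of these come from the question.

```python
import sqlite3

# Sample data from the question; times are zero-padded so text sorts by time.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_table "
    "(user_id TEXT, term_id TEXT, time_stamp TEXT, prop_list TEXT)"
)
conn.executemany(
    "INSERT INTO data_table VALUES (?, ?, ?, ?)",
    [
        ("u100", "t10", "07:00", "(a,b,c)"),
        ("u100", "t10", "07:01", "(a,b)"),
        ("u100", "t11", "07:01", "(a,b)"),
        ("u101", "t10", "07:00", "(a,b,c)"),
        ("u101", "t10", "07:01", "(a)"),
        ("u102", "t10", "06:59", "(a)"),
    ],
)

query = """
WITH first_rows AS (
    -- Earliest row for each (user_id, term_id) pair.
    SELECT user_id, term_id, prop_list
    FROM (
        SELECT user_id, term_id, prop_list,
               ROW_NUMBER() OVER (
                   PARTITION BY user_id, term_id
                   ORDER BY time_stamp ASC
               ) AS row_no
        FROM data_table
    ) AS d
    WHERE row_no = 1
),
counts AS (
    -- How many users start each term with each prop_list.
    SELECT term_id, prop_list, COUNT(*) AS n
    FROM first_rows
    GROUP BY term_id, prop_list
),
ranked AS (
    -- first_rows has one row per user and term, so the sum of n per term
    -- equals the number of distinct users for that term.
    SELECT term_id, prop_list,
           SUM(n) OVER (PARTITION BY term_id) AS term_id_distinct_count,
           ROW_NUMBER() OVER (
               PARTITION BY term_id
               ORDER BY n DESC, prop_list  -- assumed tie-break: alphabetical
           ) AS rnk
    FROM counts
)
SELECT term_id, term_id_distinct_count, prop_list
FROM ranked
WHERE rnk = 1
ORDER BY term_id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data this returns t10 with 3 users and (a,b,c), and t11 with 1 user and (a,b). The same three-step shape (deduplicate, count, rank) should carry over to other engines with window functions, though the syntax may need small adjustments.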