2015-11-18 15 views
0
PersistentId  UserId  EnterDate 
111    1   June 1, 2015 17:05 
112    1   June 1, 2015 17:21 
113    1   June 1, 2015 17:27 
114    1   June 1, 2015 18:25 
115    1   June 1, 2015 19:00 
116    2   June 1, 2015 18:05 
117    2   June 1, 2015 18:21 
118    2   June 1, 2015 19:27 

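SQL: count rows where the date difference is less than 30 minutes

I want to get a list of user IDs and, for each user ID, a count of only those rows where the difference between EnterDates is < 30 minutes.

So, for the data above, the output would be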

UserId  Count 
1    3 
2    2 

For UserId 1, the rows with PersistentIds 111, 114 and 115 should be pulled.

For UserId 2, the rows with PersistentIds 116 and 118 should be pulled.

Any ideas on how to write this SQL query?

+1

https://oracle-base.com/articles/misc/lag-lead-analytic-functions –

+0

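Would '17:05', '17:30' and '17:55' count as 2 or 3? The difference between each successive pair is less than 30 minutes, but the difference between the first and the last is 50 minutes. What happens if you have '19:00', '19:25', '20:00' and '20:25'? Is the user ID counted twice, because there are two intervals of less than 30 minutes, or 4 times? – MT0

Answers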
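The ambiguity raised in this comment can be made concrete with a small sketch. It is written in Python rather than Oracle SQL, and the helper names (`parse`, `count_near_neighbour`, `largest_trailing_window`) are made up for illustration. It assumes two possible readings of the question: count every row that lies less than 30 minutes from an adjacent row, or count the rows that fit into a single 30-minute window ending at some row.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)

def parse(hhmm):
    # All sample times are assumed to fall on June 1, 2015.
    return datetime.strptime("2015-06-01 " + hhmm, "%Y-%m-%d %H:%M")

def count_near_neighbour(times):
    """Count rows that lie less than 30 minutes from the previous or next row."""
    times = sorted(times)
    count = 0
    for i, t in enumerate(times):
        near_prev = i > 0 and t - times[i - 1] < WINDOW
        near_next = i + 1 < len(times) and times[i + 1] - t < WINDOW
        if near_prev or near_next:
            count += 1
    return count

def largest_trailing_window(times):
    """Largest number of rows in any 30-minute window that ends at a row."""
    times = sorted(times)
    return max(sum(1 for u in times if t - WINDOW <= u <= t) for t in times)

sample = [parse(s) for s in ("17:05", "17:30", "17:55")]
print(count_near_neighbour(sample))     # 3
print(largest_trailing_window(sample))  # 2
```

Under the first reading the three rows count as 3, under the second as 2, which is why the question needs to say which one is meant.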

0

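Here are two queries. Both give your expected result and both use 30-minute windows, but they rest on completely different interpretations of your requirement for calculating the time between a user's enterDate values... you may want to clarify the question.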

SQL Fiddle

Oracle 11g R2 Schema Setup

CREATE TABLE table_name (PersistentId, UserId, EnterDate) AS 
      SELECT 111, 1, to_date('June 1, 2015 17:05','Month DD, YYYY HH24:MI') FROM DUAL 
UNION ALL SELECT 112, 1, to_date('June 1, 2015 17:21','Month DD, YYYY HH24:MI') FROM DUAL 
UNION ALL SELECT 113, 1, to_date('June 1, 2015 17:27','Month DD, YYYY HH24:MI') FROM DUAL 
UNION ALL SELECT 114, 1, to_date('June 1, 2015 18:25','Month DD, YYYY HH24:MI') FROM DUAL 
UNION ALL SELECT 115, 1, to_date('June 1, 2015 19:00','Month DD, YYYY HH24:MI') FROM DUAL 
UNION ALL SELECT 116, 2, to_date('June 1, 2015 18:05','Month DD, YYYY HH24:MI') FROM DUAL 
UNION ALL SELECT 117, 2, to_date('June 1, 2015 18:21','Month DD, YYYY HH24:MI') FROM DUAL 
UNION ALL SELECT 118, 2, to_date('June 1, 2015 19:27','Month DD, YYYY HH24:MI') FROM DUAL 

Query 1 - Count of rows in a 30-minute window:

SELECT UserId, 
     "Count" 
FROM (
    SELECT UserID, 
     COUNT(*) OVER (PARTITION BY UserId ORDER BY EnterDate RANGE BETWEEN INTERVAL '30' MINUTE PRECEDING AND CURRENT ROW) AS "Count", 
     EnterDate, 
     LEAD(EnterDate) OVER (PARTITION BY UserId ORDER BY EnterDate) AS nextEnterDate 
    FROM Table_Name 
) 
WHERE "Count" > 1 
AND EnterDate + INTERVAL '30' MINUTE < nextEnterDate 

Results

| USERID | Count | 
|--------|-------| 
|  1 |  3 | 
|  2 |  2 | 

Query 2 - Count of all rows that are within 30 minutes of another row:

SELECT UserID, 
     COUNT(1) AS "Count" 
FROM (
    SELECT UserID, 
     EnterDate, 
     LAG(EnterDate) OVER (PARTITION BY UserId ORDER BY EnterDate) AS prevDate, 
     LEAD(EnterDate) OVER (PARTITION BY UserId ORDER BY EnterDate) AS nextDate 
    FROM Table_Name 
) 
WHERE EnterDate - INTERVAL '30' MINUTE < prevDate 
OR  EnterDate + INTERVAL '30' MINUTE > nextDate 
GROUP BY UserId 

Results

| USERID | Count | 
|--------|-------| 
|  1 |  3 | 
|  2 |  2 | 
+0

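My apologies. I did not word the question correctly. First, a correction: I am only interested in rows where the difference between enterDates is > 30 minutes. In the original example it would count 3 for userId 1, because it would pull the records with persistentId 111, 114 and 115. The difference between the EnterDates of 111 and 114 is more than 30 minutes, and so is the difference between 114 and 115. – tsc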
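One way to read this clarified requirement is "count each user's first row plus every row that comes more than 30 minutes after that user's previous row". That reading gives 111, 116 and 118 for the sample, but for UserId 1 it counts 114 and 115 by their gaps to 113 and 114, not by the gap between 111 and 114 named in the comment. The sketch below tries it with LAG. It is only an illustration under several assumptions: it runs on SQLite through Python's sqlite3 module instead of Oracle, stores EnterDate as ISO text, needs SQLite 3.25 or later for window functions, and the names `QUERY` and `gap_counts` are invented here.

```python
import sqlite3

rows = [
    (111, 1, "2015-06-01 17:05"),
    (112, 1, "2015-06-01 17:21"),
    (113, 1, "2015-06-01 17:27"),
    (114, 1, "2015-06-01 18:25"),
    (115, 1, "2015-06-01 19:00"),
    (116, 2, "2015-06-01 18:05"),
    (117, 2, "2015-06-01 18:21"),
    (118, 2, "2015-06-01 19:27"),
]

# Keep a row if it is the user's first row, or if it is more than
# 1800 seconds (30 minutes) after the user's previous row.
QUERY = """
SELECT UserId, COUNT(*) AS cnt
FROM (
    SELECT UserId,
           EnterDate,
           LAG(EnterDate) OVER (PARTITION BY UserId ORDER BY EnterDate) AS prevDate
    FROM table_name
)
WHERE prevDate IS NULL
   OR CAST(strftime('%s', EnterDate) AS INTEGER)
      - CAST(strftime('%s', prevDate) AS INTEGER) > 1800
GROUP BY UserId
ORDER BY UserId
"""

def gap_counts(data):
    """Load the rows into an in-memory SQLite table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table_name (PersistentId INTEGER, UserId INTEGER, EnterDate TEXT)"
    )
    conn.executemany("INSERT INTO table_name VALUES (?, ?, ?)", data)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

print(gap_counts(rows))  # [(1, 3), (2, 2)]
```

In Oracle the same predicate could presumably be written with the interval arithmetic used in the answer above; the SQLite date functions here are a stand-in so the sketch can run anywhere.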

+0

Based on the solution provided, I think this will work: 'SELECT UserID, COUNT(1) AS "Count" FROM ( SELECT persistentid, UserID, EnterDate, LAG(EnterDate) OVER (PARTITION BY UserID ORDER BY EnterDate) AS prevDate, LEAD(EnterDate) OVER (PARTITION BY UserID ORDER BY EnterDate) AS nextDate FROM TABLE_NAME ) WHERE EnterDate + INTERVAL '30' MINUTE …' – tsc

+0

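@tsc - That query gives the expected result for this data, but it will not work for other data. Look at the two queries in this [SQLFIDDLE](http://sqlfiddle.com/#!4/31136/2) - I have commented out the row with a persistentId of 117, so that 'UserId' now has zero rows within 30 minutes of another row, yet your query (the second one) still shows a count (which it always will with the 'nextDate IS NULL' clause). – MT0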

0

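Your question is not clearly worded, but based on the results you want, I think you want to use NOT EXISTS to filter out records that come less than 30 minutes after another record with the same user ID. Like this: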

with d as ( 
SELECT 111 persistent_id, 1 user_id, to_date('June 1, 2015 17:05','Month DD, YYYY HH24:MI') enter_date from dual UNION ALL 
SELECT 112 persistent_id, 1 user_id, to_date('June 1, 2015 17:21','Month DD, YYYY HH24:MI') from dual UNION ALL 
SELECT 113 persistent_id, 1 user_id, to_date('June 1, 2015 17:27','Month DD, YYYY HH24:MI') from dual UNION ALL 
SELECT 114 persistent_id, 1 user_id, to_date('June 1, 2015 18:25','Month DD, YYYY HH24:MI') from dual UNION ALL 
SELECT 115 persistent_id, 1 user_id, to_date('June 1, 2015 19:00','Month DD, YYYY HH24:MI') from dual UNION ALL 
SELECT 116 persistent_id, 2 user_id, to_date('June 1, 2015 18:05','Month DD, YYYY HH24:MI') from dual UNION ALL 
SELECT 117 persistent_id, 2 user_id, to_date('June 1, 2015 18:21','Month DD, YYYY HH24:MI') from dual UNION ALL 
SELECT 118 persistent_id, 2 user_id, to_date('June 1, 2015 19:27','Month DD, YYYY HH24:MI') from dual 
) 
select d.user_id, count(*) 
from d 
where not exists (SELECT 'record for same userid but less than 30 minutes earlier' 
        FROM d d2 
        WHERE d2.user_id = d.user_id 
        AND d2.enter_date between d.enter_date - (0.5/24) and d.enter_date 
        and d2.persistent_id != d.persistent_id) 
group by d.user_id     
order by d.user_id 
0

You can use the LAG function to get the previous event.

select user_id, count(*) 
from 
(with d as ( 
SELECT 111 persistent_id, 1 user_id, to_date('June 1, 2015 17:05','Month DD, YYYY HH24:MI') enter_date from dual UNION ALL 
SELECT 112 persistent_id, 1 user_id, to_date('June 1, 2015 17:21','Month DD, YYYY HH24:MI') from dual UNION ALL 
SELECT 113 persistent_id, 1 user_id, to_date('June 1, 2015 17:27','Month DD, YYYY HH24:MI') from dual UNION ALL 
SELECT 114 persistent_id, 1 user_id, to_date('June 1, 2015 18:25','Month DD, YYYY HH24:MI') from dual UNION ALL 
SELECT 115 persistent_id, 1 user_id, to_date('June 1, 2015 19:00','Month DD, YYYY HH24:MI') from dual UNION ALL 
SELECT 116 persistent_id, 2 user_id, to_date('June 1, 2015 18:05','Month DD, YYYY HH24:MI') from dual UNION ALL 
SELECT 117 persistent_id, 2 user_id, to_date('June 1, 2015 18:21','Month DD, YYYY HH24:MI') from dual UNION ALL 
SELECT 118 persistent_id, 2 user_id, to_date('June 1, 2015 19:27','Month DD, YYYY HH24:MI') from dual 
) 
select d.user_id, persistent_id, enter_date 
,lag(persistent_id) over (partition by user_id order by enter_date) 
,lag(enter_date) over (partition by user_id order by enter_date) 
,(enter_date - nvl (lag(enter_date) over (partition by user_id order by enter_date), enter_date))*24*60 duration 
from d 
) where duration < 30 
group by user_id 

--results 
    USER_ID COUNT(*) 
1 1 3 
2 2 2 
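
As a cross-check of which rows the `duration < 30` filter above actually keeps, here is a small Python sketch that mirrors the query's logic. It is not part of the original answer, and the function name `rows_passing_lag_filter` is made up for illustration. It assumes, as the query does through NVL, that a user's first row gets a duration of 0.

```python
from datetime import datetime

rows = [
    (111, 1, "2015-06-01 17:05"),
    (112, 1, "2015-06-01 17:21"),
    (113, 1, "2015-06-01 17:27"),
    (114, 1, "2015-06-01 18:25"),
    (115, 1, "2015-06-01 19:00"),
    (116, 2, "2015-06-01 18:05"),
    (117, 2, "2015-06-01 18:21"),
    (118, 2, "2015-06-01 19:27"),
]

def rows_passing_lag_filter(data, limit_minutes=30):
    """Return {user_id: [persistent_id, ...]} for rows whose duration,
    measured in minutes since the user's previous row, is below the limit.
    A user's first row has duration 0, like the NVL fallback in the query."""
    selected = {}
    previous = {}
    # ISO-formatted timestamps sort chronologically as plain strings.
    for pid, uid, stamp in sorted(data, key=lambda r: (r[1], r[2])):
        t = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        prev = previous.get(uid, t)
        duration = (t - prev).total_seconds() / 60
        if duration < limit_minutes:
            selected.setdefault(uid, []).append(pid)
        previous[uid] = t
    return selected

print(rows_passing_lag_filter(rows))
# {1: [111, 112, 113], 2: [116, 117]}
```

The counts agree with the expected 3 and 2, but the rows kept are 111, 112, 113 and 116, 117, not the 111, 114, 115 and 116, 118 named in the question, so the filter direction may need to be reversed depending on which reading of the question is intended.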