2017-09-20 50 views
2

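Calculating a rolling count over a specified window

Sample data will probably explain what I am trying to do better than a description would, so I will start with that.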

Here is the data I am currently working with:

+-------------------------+--------------+ 
|  CallStart  | CallDuration | 
+-------------------------+--------------+ 
| 2017-09-15 09:15:15.313 | 00:01:28  | 
| 2017-09-15 09:15:15.317 | 00:01:45  | 
| 2017-09-15 09:16:45.603 | 00:01:31  | 
| 2017-09-15 09:17:00.637 | 00:01:24  | 
| 2017-09-15 09:18:20.853 | 00:01:42  | 
| 2017-09-15 09:18:25.870 | 00:01:24  | 
| 2017-09-15 11:27:05.117 | 00:00:59  | 
| 2017-09-15 11:31:16.053 | 00:01:18  | 
| 2017-09-15 11:34:41.627 | 00:01:00  | 
| 2017-09-15 12:16:45.413 | 00:01:01  | 
| 2017-09-15 12:18:15.820 | 00:01:05  | 
| 2017-09-15 12:30:43.607 | 00:01:04  | 
| 2017-09-15 12:31:48.817 | 00:00:55  | 
| 2017-09-15 12:35:14.563 | 00:00:59  | 
| 2017-09-15 12:42:10.947 | 00:00:43  | 
| 2017-09-15 12:56:28.807 | 00:01:14  | 
| 2017-09-15 13:05:10.643 | 00:00:37  | 
| 2017-09-15 13:20:08.400 | 00:00:37  | 
| 2017-09-15 14:30:12.607 | 00:00:59  | 
| 2017-09-15 14:31:22.807 | 00:00:49  | 
| 2017-09-15 15:19:47.240 | 00:01:07  | 
| 2017-09-15 16:04:47.753 | 00:00:55  | 
| 2017-09-15 16:58:08.080 | 00:00:55  | 
| 2017-09-15 17:05:04.557 | 00:00:50  | 
| 2017-09-15 17:20:42.753 | 00:00:58  | 
| 2017-09-15 17:28:09.140 | 00:01:05  | 
| 2017-09-15 17:39:46.690 | 00:00:38  | 
| 2017-09-15 17:40:21.957 | 00:01:05  | 
| 2017-09-15 17:43:47.570 | 00:01:08  | 
| 2017-09-15 17:47:23.390 | 00:01:05  | 
| 2017-09-15 17:47:28.410 | 00:00:56  | 
| 2017-09-15 17:51:59.380 | 00:01:04  | 
+-------------------------+--------------+ 

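I am trying to get a rolling COUNT(*) of the occurrences in this data within a 15-minute window: for each row, the number of rows whose CallStart falls in the 15 minutes up to and including that row's CallStart. The expected result for this data would be the following: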

+-------------------------+--------------+------------------+ 
|  CallStart  | CallDuration | DropsIn15Minutes | 
+-------------------------+--------------+------------------+ 
| 2017-09-15 09:15:15.313 | 00:01:28  |    1 | 
| 2017-09-15 09:15:15.317 | 00:01:45  |    2 | 
| 2017-09-15 09:16:45.603 | 00:01:31  |    3 | 
| 2017-09-15 09:17:00.637 | 00:01:24  |    4 | 
| 2017-09-15 09:18:20.853 | 00:01:42  |    5 | 
| 2017-09-15 09:18:25.870 | 00:01:24  |    6 | 
| 2017-09-15 11:27:05.117 | 00:00:59  |    1 | 
| 2017-09-15 11:31:16.053 | 00:01:18  |    2 | 
| 2017-09-15 11:34:41.627 | 00:01:00  |    3 | 
| 2017-09-15 12:16:45.413 | 00:01:01  |    1 | 
| 2017-09-15 12:18:15.820 | 00:01:05  |    2 | 
| 2017-09-15 12:30:43.607 | 00:01:04  |    3 | 
| 2017-09-15 12:31:48.817 | 00:00:55  |    3 | 
| 2017-09-15 12:35:14.563 | 00:00:59  |    3 | 
| 2017-09-15 12:42:10.947 | 00:00:43  |    4 | 
| 2017-09-15 12:56:28.807 | 00:01:14  |    2 | 
| 2017-09-15 13:05:10.643 | 00:00:37  |    2 | 
| 2017-09-15 13:20:08.400 | 00:00:37  |    2 | 
| 2017-09-15 14:30:12.607 | 00:00:59  |    1 | 
| 2017-09-15 14:31:22.807 | 00:00:49  |    2 | 
| 2017-09-15 15:19:47.240 | 00:01:07  |    1 | 
| 2017-09-15 16:04:47.753 | 00:00:55  |    1 | 
| 2017-09-15 16:58:08.080 | 00:00:55  |    1 | 
| 2017-09-15 17:05:04.557 | 00:00:50  |    2 | 
| 2017-09-15 17:20:42.753 | 00:00:58  |    1 | 
| 2017-09-15 17:28:09.140 | 00:01:05  |    2 | 
| 2017-09-15 17:39:46.690 | 00:00:38  |    2 | 
| 2017-09-15 17:40:21.957 | 00:01:05  |    3 | 
| 2017-09-15 17:43:47.570 | 00:01:08  |    3 | 
| 2017-09-15 17:47:23.390 | 00:01:05  |    4 | 
| 2017-09-15 17:47:28.410 | 00:00:56  |    5 | 
| 2017-09-15 17:51:59.380 | 00:01:04  |    6 | 
+-------------------------+--------------+------------------+ 

Sample data:

Create Table #Calls 
(
    CallStart DateTime, 
    CallDuration Time(0) 
); 
Insert Into #Calls 
Values (N'2017-09-15T09:15:15.313', N'00:01:28'), 
    (N'2017-09-15T09:15:15.317', N'00:01:45'), 
    (N'2017-09-15T09:16:45.603', N'00:01:31'), 
    (N'2017-09-15T09:17:00.637', N'00:01:24'), 
    (N'2017-09-15T09:18:20.853', N'00:01:42'), 
    (N'2017-09-15T09:18:25.87', N'00:01:24'), 
    (N'2017-09-15T11:27:05.117', N'00:00:59'), 
    (N'2017-09-15T11:31:16.053', N'00:01:18'), 
    (N'2017-09-15T11:34:41.627', N'00:01:00'), 
    (N'2017-09-15T12:16:45.413', N'00:01:01'), 
    (N'2017-09-15T12:18:15.82', N'00:01:05'), 
    (N'2017-09-15T12:30:43.607', N'00:01:04'), 
    (N'2017-09-15T12:31:48.817', N'00:00:55'), 
    (N'2017-09-15T12:35:14.563', N'00:00:59'), 
    (N'2017-09-15T12:42:10.947', N'00:00:43'), 
    (N'2017-09-15T12:56:28.807', N'00:01:14'), 
    (N'2017-09-15T13:05:10.643', N'00:00:37'), 
    (N'2017-09-15T13:20:08.4', N'00:00:37'), 
    (N'2017-09-15T14:30:12.607', N'00:00:59'), 
    (N'2017-09-15T14:31:22.807', N'00:00:49'), 
    (N'2017-09-15T15:19:47.24', N'00:01:07'), 
    (N'2017-09-15T16:04:47.753', N'00:00:55'), 
    (N'2017-09-15T16:58:08.08', N'00:00:55'), 
    (N'2017-09-15T17:05:04.557', N'00:00:50'), 
    (N'2017-09-15T17:20:42.753', N'00:00:58'), 
    (N'2017-09-15T17:28:09.14', N'00:01:05'), 
    (N'2017-09-15T17:39:46.69', N'00:00:38'), 
    (N'2017-09-15T17:40:21.957', N'00:01:05'), 
    (N'2017-09-15T17:43:47.57', N'00:01:08'), 
    (N'2017-09-15T17:47:23.39', N'00:01:05'), 
    (N'2017-09-15T17:47:28.41', N'00:00:56'), 
    (N'2017-09-15T17:51:59.38', N'00:01:04'); 

I can get this somewhat working with the following:

Select CallStart, 
     CallDuration, 
     DropsIn15Minutes = 
     (
      Select Count(*) 
      From #Calls C2 
      Where C2.CallStart Between DateAdd(Minute, -15, C1.CallStart) 
           And  C1.CallStart 
     ) 
From #Calls C1 

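However, I would like to avoid the subquery and use something like COUNT(*) OVER () instead (or any other solution, if possible).

Is this possible? Or is the subquery the right fit for this problem?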

+0

Why do some consecutive rows have the same number? Do you need a total per group, or consecutive numbers? –

+0

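@VamsiPrabhala Some consecutive rows have the same number because enough time has passed between them that an earlier row has dropped out of the 15-minute window. This just looks at the count of records; there is no grouping. – Siyual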

+2

Does the millisecond precision really matter, or can we truncate the milliseconds to seconds? –

Answers

2

One way (which may perform better than a nested loops join if the table is large) would be to first create a numbers table...

CREATE TABLE dbo.Numbers 
(
N INT PRIMARY KEY 
); 

    WITH E1(N) AS 
    (
     SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
     SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
     SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 
    )          -- 1*10^1 or 10 rows 
    , E2(N) AS (SELECT 1 FROM E1 a, E1 b) -- 1*10^2 or 100 rows 
    , E4(N) AS (SELECT 1 FROM E2 a, E2 b) -- 1*10^4 or 10,000 rows 
    , E8(N) AS (SELECT 1 FROM E4 a, E4 b) -- 1*10^8 or 100,000,000 rows 
INSERT INTO dbo.Numbers 
    SELECT TOP (60*60*24) -1 + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM E8; 

Then use something along the following lines.

WITH Calls 
    AS (SELECT *, 
       --pre-truncate all call starts to second precision 
       CallStart_sec = DATEADD(SECOND, DATEDIFF(SECOND, '20000101', CallStart), '20000101') 
     FROM #Calls), 
    PreAgg 
    AS (SELECT CallStart_sec, 
       COUNT(*) AS Cnt 
     FROM Calls 
     GROUP BY CallStart_sec), 
    Dates(D) 
    --Todo - something else other than hardcoding the dates 
    AS (SELECT CAST('2017-09-15' AS DATETIME2)), 
    RT 
    AS (SELECT *, 
       Cume = SUM(Cnt) OVER (ORDER BY DATEADD(SECOND, N.N, Dates.D) 
           ROWS BETWEEN 900 PRECEDING AND CURRENT ROW) 
     FROM Dates 
       INNER JOIN dbo.Numbers N 
        ON N.N BETWEEN 0 AND 86399 
       LEFT JOIN PreAgg P 
        ON P.CallStart_sec = DATEADD(SECOND, N.N, Dates.D)) 
SELECT C.CallStart_sec AS CallStart, 
     CallDuration, 
     DropsIn15Minutes = Cume 
FROM Calls C 
     JOIN RT 
     ON RT.CallStart_sec = C.CallStart_sec 
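
A note on why the query above takes this shape: SQL Server's RANGE frame only accepts UNBOUNDED and CURRENT ROW bounds, so a frame such as "15 minutes preceding" cannot be written directly. Joining to one row per second of the day makes a ROWS frame of 900 preceding rows plus the current row behave like a 15-minute window at second precision. Because the timestamps are truncated first, counts can differ from a millisecond-exact comparison when two calls sit right at the edge of the window. The truncation step can be checked in isolation with a minimal sketch like this one, where @CallStart is a hypothetical stand-in for a value from #Calls:

```sql
-- @CallStart is an illustrative variable, not part of the original answer.
DECLARE @CallStart DATETIME = '2017-09-15T09:15:15.317';

-- DATEDIFF counts the second boundaries crossed since the anchor date,
-- so adding that many seconds back to the anchor drops the milliseconds.
SELECT CallStart_sec = DATEADD(SECOND, DATEDIFF(SECOND, '20000101', @CallStart), '20000101');
```

For the value above this should return 2017-09-15 09:15:15.000.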
+0

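Ah, I should have refreshed the page before posting. I will delete my answer, since it is essentially the same as what you suggested. This works great, thank you! – Siyual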

2

Two options:

Using outer apply()

select 
    CallStart 
    , CallDuration 
    , DropsIn15Minutes 
from calls c 
    outer apply (
    select DropsIn15Minutes = count(*) 
    from calls i 
    where i.callstart >= dateadd(minute,-15,c.CallStart) 
     and i.callstart <= c.CallStart 
    ) x 
order by c.CallStart 

Using inner join

select 
    c.CallStart 
    , c.CallDuration 
    , DropsIn15Minutes = count(i.CallStart) 
from calls c 
    inner join calls i 
    on i.callstart >= dateadd(minute,-15,c.CallStart) 
    and i.callstart <= c.CallStart 
group by c.CallStart, c.CallDuration 
order by c.CallStart 

rextester demo: http://rextester.com/PFCE43712

Comparing the execution plans of all three (including yours): dbfiddle.uk demo

+0

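These work, although they seem to take about twice as long to run :(. I was more wondering whether there is a way to do this without hitting the table twice. – Siyual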

+0

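@Siyual - With your sample data the table is hit 33 times, not twice: once to get the outer rows, and then once more for each outer row. So you will definitely want an index to support these seeks. Depending on the density of the data, though, it may still be faster than mine. If you have many rows per second, I would expect the one in my answer to be more competitive. –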
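
The index mentioned in the comment above could look something like the following. This is only a sketch: the index name is made up, and it assumes the data lives in the #Calls temp table from the question. Whether a clustered or nonclustered index is the better fit depends on the real table.

```sql
-- IX_Calls_CallStart is a hypothetical name.
-- Keying on CallStart lets each per-row lookup seek to the start of its
-- 15-minute range instead of scanning the whole table.
-- CallDuration is included so the index covers the queries in this thread.
CREATE NONCLUSTERED INDEX IX_Calls_CallStart
    ON #Calls (CallStart)
    INCLUDE (CallDuration);
```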

-1
Select CallStart, 
     CallDuration, 
COUNT(*) OVER (PARTITION BY trunc(CallStart,'mi') - 
     numtodsinterval(mod(to_char(CallStart,'mi'),15),'minute')) 
as DropsIn15Minutes 
From #Calls C1 

Let's try this out and see how it works.

Ted

+1

I don't know what syntax this is, but it isn't SQL Server. – Siyual

+0

Oops, sorry! Yes, this is Oracle. –
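
For readers on SQL Server, a rough T-SQL counterpart of the Oracle expression above might look like the sketch below. It is an assumption about the intent of that answer, not something posted in the thread, and CallsInBucket is a hypothetical column name. Note that it groups calls into fixed quarter-hour buckets (09:00, 09:15, 09:30, and so on) and returns the bucket total on every row, which is not the rolling count the question asks for.

```sql
SELECT CallStart,
       CallDuration,
       -- Integer division by 15 floors the minute count to the start of its
       -- quarter-hour bucket; the literal 0 is implicitly converted to 1900-01-01.
       CallsInBucket = COUNT(*) OVER (
           PARTITION BY DATEADD(MINUTE, DATEDIFF(MINUTE, 0, CallStart) / 15 * 15, 0))
FROM #Calls;
```

With the sample data, the six calls between 09:15 and 09:19 all fall into the 09:15 bucket and would each show 6, whereas the expected rolling result for those rows is 1 through 6.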