2017-08-30 74 views

WHERE clause to exclude rows that have a timestamp within 50 ms on either side?

I have a table, part of which looks like this:

timestamp     | Source 
----------------------------+---------- 
2017-07-28 14:20:28.757464 | Stream 
2017-07-28 14:20:28.775248 | Poll 
2017-07-28 14:20:29.777678 | Poll 
2017-07-28 14:21:28.582532 | Stream 

I want to get this:

timestamp     | Source 
----------------------------+---------- 
2017-07-28 14:20:28.757464 | Stream 
2017-07-28 14:20:29.777678 | Poll 
2017-07-28 14:21:28.582532 | Stream 

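Row 2 of the original table has been removed because its timestamp is within 50 ms of the timestamp before or after it. Importantly, a row should only be removed when Source = 'Poll'.

I'm not sure how to do this with a WHERE clause. Is it possible?

Thanks in advance for any help.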


What happens if there are three Poll rows in a row and all three timestamps are within 50 ms of each other? –


What about three Poll rows, each less than 50 ms apart, where the third one is 51 ms away from the Stream row? –


That won't happen in the data, because the poller's interval is set to longer than 50 ms. Only Stream data can fall within 50 ms of a Poll. – Harry

Answer


Whatever we do, we can restrict it to the Poll rows (the pools CTE below) and then union those rows with the Stream rows.

with 
streams as (
select * 
from test 
where Source = 'Stream' 
), 
pools as (
    ... 
) 

(select * from pools) union (select * from streams) order by timestamp 

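There are several options for building pools:

Correlated subquery

For each row we run an extra query to fetch the previous row with the same Source, then keep only the rows that either have no previous timestamp (the first row) or whose previous timestamp is more than 50 ms earlier.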

with 
... 
pools_with_prev as (
    -- use correlated subquery 
    select 
    timestamp, Source, 
    timestamp - interval '00:00:00.05' 
     as timestamp_prev_limit, 
    (select max(t2.timestamp)from test as t2 
     where t2.timestamp < test.timestamp and 
    t2.Source = test.Source) 
     as timestamp_prev 
    from test 
), 
pools as (
    select timestamp, Source 
    from pools_with_prev 
    -- then select rows which are >50ms apart 
    where timestamp_prev is NULL or 
    timestamp_prev < timestamp_prev_limit 
) 

... 

https://www.db-fiddle.com/f/iVgSkvTVpqjNZ5F5RZVSd2/2

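Joining two shifted tables

Instead of running a subquery for every row, we can make a copy of the table and shift it down by one row, so that each Poll row joins to the previous row of the same Source.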

with 
... 
pools_rn as (
-- add extra row number column 
-- rows: 1, 2, 3 
select *, 
    row_number() over (order by timestamp) as rn 
from test 
where Source = 'Poll' 
), 
pools_rn_prev as (
-- add extra row number column increased by one 
-- like sliding a copy of the table one row down 
-- rows: 2, 3, 4 
select timestamp as timestamp_prev, 
    row_number() over (order by timestamp)+1 as rn 
from test 
where Source = 'Poll' 
), 
pools as (
-- now join prev two tables on this column 
-- each row will join with its predecessor 
select timestamp, source 
from pools_rn 
    left outer join pools_rn_prev 
    on pools_rn.rn = pools_rn_prev.rn 
where 
    -- then select rows which are >50ms apart 
    timestamp_prev is null or 
    timestamp - interval '00:00:00.05' > timestamp_prev 
) 

... 

https://www.db-fiddle.com/f/gXmSxbqkrxpvksE8Q4ogEU/2

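Sliding window

Modern SQL can do something similar by partitioning by Source and then using a window function to pair each row with the previous one.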

with 
... 
pools_with_prev as (
    -- use sliding window to join prev timestamp 
    select *, 
    timestamp - interval '00:00:00.05' 
     as timestamp_prev_limit, 
    lag(timestamp) over(
     partition by Source order by timestamp 
    ) as timestamp_prev 
    from test 
), 
pools as (
    select timestamp, Source 
    from pools_with_prev 
    -- then select rows which are >50ms apart 
    where timestamp_prev is NULL or 
    timestamp_prev < timestamp_prev_limit 
) 


... 

https://www.db-fiddle.com/f/8KfTyqRBU62SFSoiZfpu6Q/1

I believe this is the most efficient of the three.
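
For comparison, the rule as stated in the question (drop a Poll row only when a Stream row lies within 50 ms on either side of it) can also be written directly as a NOT EXISTS condition in the WHERE clause. The following is a minimal sketch, not the method used above: it runs the query through Python's sqlite3 module against an in-memory database so that it can be executed as-is, and it swaps PostgreSQL interval arithmetic for SQLite's julianday(). The table name test is taken from the queries above; ROWS, QUERY and filter_polls are hypothetical names, and treating a gap of exactly 50 ms as "within" is an assumption.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("2017-07-28 14:20:28.757464", "Stream"),
    ("2017-07-28 14:20:28.775248", "Poll"),
    ("2017-07-28 14:20:29.777678", "Poll"),
    ("2017-07-28 14:21:28.582532", "Stream"),
]

# Keep every non-Poll row; keep a Poll row only if no Stream row
# lies within 50 ms of it (before or after).
# julianday() returns days, so the difference is scaled to milliseconds.
QUERY = """
select p.timestamp, p.Source
from test as p
where p.Source <> 'Poll'
   or not exists (
        select 1
        from test as s
        where s.Source = 'Stream'
          and abs(julianday(s.timestamp) - julianday(p.timestamp))
              * 86400000.0 <= 50
   )
order by p.timestamp
"""


def filter_polls(rows):
    """Load rows into an in-memory table and return the filtered rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table test (timestamp text, Source text)")
        conn.executemany("insert into test values (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in filter_polls(ROWS):
        print(row)
```

With the sample data this keeps both Stream rows and the Poll row at 14:20:29.777678, which matches the desired result in the question. In PostgreSQL the same condition could presumably be written as s.timestamp between p.timestamp - interval '50 milliseconds' and p.timestamp + interval '50 milliseconds'.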