Whatever approach we take, we can restrict it to the Pools and then union those rows with the Streams.
with
streams as (
select *
from test
where Source = 'Stream'
),
pools as (
...
)
(select * from pools) union (select * from streams) order by timestamp
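To try the queries in this answer locally, a small test table helps. The following is only a sketch, assuming PostgreSQL and a table with just the two columns the queries reference; the sample rows are invented for illustration and are not the asker's data.

```sql
-- Hypothetical schema and rows: only the column names (timestamp, Source)
-- and the values 'Pool' / 'Stream' come from the queries in this answer.
create table test (
    timestamp timestamp not null,
    Source    text      not null
);

insert into test (timestamp, Source) values
    ('2020-01-01 00:00:00.000', 'Pool'),    -- no earlier Pool row
    ('2020-01-01 00:00:00.020', 'Stream'),
    ('2020-01-01 00:00:00.030', 'Pool'),    -- 30 ms after the previous Pool row
    ('2020-01-01 00:00:00.200', 'Pool'),    -- 170 ms after the previous Pool row
    ('2020-01-01 00:00:00.210', 'Stream');
```

With these rows, each of the pools definitions below should keep the Pool rows at .000 and .200 and drop the one at .030, because its previous Pool row is less than 50ms older.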
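To get the Pools, there are several options:
Correlated subquery
For each row we run an extra query to fetch the timestamp of the previous row with the same Source, then keep only the rows that have no previous timestamp (the first row) or whose previous timestamp is more than 50ms older.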
with
...
pools_with_prev as (
-- use correlated subquery
select
timestamp, Source,
timestamp - interval '00:00:00.05'
as timestamp_prev_limit,
(select max(t2.timestamp) from test as t2
where t2.timestamp < test.timestamp and
t2.Source = test.Source)
as timestamp_prev
from test
),
pools as (
select timestamp, Source
from pools_with_prev
-- then select rows which are >50ms apart
where timestamp_prev is NULL or
timestamp_prev < timestamp_prev_limit
)
...
https://www.db-fiddle.com/f/iVgSkvTVpqjNZ5F5RZVSd2/2
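Joining two shifted tables
Instead of running a subquery for each row, we can make a copy of the table and shift it by one row, so that each Pool row joins with the previous row of the same Source.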
with
...
pools_rn as (
-- add extra row number column
-- rows: 1, 2, 3
select *,
row_number() over (order by timestamp) as rn
from test
where Source = 'Pool'
),
pools_rn_prev as (
-- add extra row number column increased by one
-- like sliding a copy of the table one row down
-- rows: 2, 3, 4
select timestamp as timestamp_prev,
row_number() over (order by timestamp)+1 as rn
from test
where Source = 'Pool'
),
pools as (
-- now join prev two tables on this column
-- each row will join with its predecessor
select timestamp, source
from pools_rn
left outer join pools_rn_prev
on pools_rn.rn = pools_rn_prev.rn
where
-- then select rows which are >50ms apart
timestamp_prev is null or
timestamp - interval '00:00:00.05' > timestamp_prev
)
...
https://www.db-fiddle.com/f/gXmSxbqkrxpvksE8Q4ogEU/2
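To see what the shifted join produces before the 50ms filter is applied, the two numbered CTEs can be joined and inspected directly. This is a sketch assuming PostgreSQL; the inline test rows are hypothetical and stand in for the real table.

```sql
-- Hypothetical inline sample data (three Pool rows), not from the original post.
with test (timestamp, Source) as (
    values
        (timestamp '2020-01-01 00:00:00.000', 'Pool'),
        (timestamp '2020-01-01 00:00:00.030', 'Pool'),
        (timestamp '2020-01-01 00:00:00.200', 'Pool')
),
pools_rn as (
    -- rows numbered 1, 2, 3
    select timestamp,
           row_number() over (order by timestamp) as rn
    from test
    where Source = 'Pool'
),
pools_rn_prev as (
    -- same rows numbered 2, 3, 4
    select timestamp as timestamp_prev,
           row_number() over (order by timestamp) + 1 as rn
    from test
    where Source = 'Pool'
)
-- Show every row next to its predecessor, without filtering.
-- rn 1 finds no match (timestamp_prev is null), rn 2 pairs with the
-- .000 row, and rn 3 pairs with the .030 row.
select pools_rn.rn, pools_rn.timestamp, pools_rn_prev.timestamp_prev
from pools_rn
left outer join pools_rn_prev on pools_rn.rn = pools_rn_prev.rn
order by pools_rn.rn;
```

The row numbered 4 in pools_rn_prev never matches anything, which is harmless because the left outer join only keeps rows from pools_rn.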
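Sliding window
Modern SQL can do something similar with window functions: partition by Source, then use a sliding window to pair each row with the previous row.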
with
...
pools_with_prev as (
-- use sliding window to join prev timestamp
select *,
timestamp - interval '00:00:00.05'
as timestamp_prev_limit,
lag(timestamp) over(
partition by Source order by timestamp
) as timestamp_prev
from test
),
pools as (
select timestamp, Source
from pools_with_prev
-- then select rows which are >50ms apart
where timestamp_prev is NULL or
timestamp_prev < timestamp_prev_limit
)
...
https://www.db-fiddle.com/f/8KfTyqRBU62SFSoiZfpu6Q/1
I believe this is the most efficient of the three options.
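For reference, the sliding-window pieces can be assembled into one statement along these lines. This is a sketch rather than the exact query from the fiddle: it assumes PostgreSQL, selects explicit columns on both sides so the union has matching column counts, and adds a Source = 'Pool' filter in pools, which the snippets above leave to the final union.

```sql
with
streams as (
    select timestamp, Source
    from test
    where Source = 'Stream'
),
pools_with_prev as (
    -- pair every row with the previous timestamp of the same Source
    select timestamp, Source,
           lag(timestamp) over (
               partition by Source order by timestamp
           ) as timestamp_prev
    from test
),
pools as (
    select timestamp, Source
    from pools_with_prev
    -- assumption: keep only Pool rows here, then apply the 50ms rule
    where Source = 'Pool'
      and (timestamp_prev is null
           or timestamp_prev < timestamp - interval '50 milliseconds')
)
select timestamp, Source from pools
union
select timestamp, Source from streams
order by timestamp;
```

Filtering on Source inside pools keeps that CTE limited to Pool rows, so the result does not depend on union removing duplicate Stream rows.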
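What happens if we have three poll rows in a row and all three are within 50ms of each other?
Three polls, each less than 50ms apart, yet the third poll is 51ms from the Stream: what then?
That doesn't happen in the data, because the poller interval is set longer than 50ms. Only stream data can fall within 50ms of a poll. – Harry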