2012-05-12 64 views

How can I merge overlapping times within a window?

I have a table like this:

CREATE TABLE #TEMP (Name VARCHAR(255), START_TIME datetime, END_TIME datetime); 

INSERT INTO #TEMP VALUES('John', '2012-01-01 09:00:01', '2012-01-01 12:00:02') 
INSERT INTO #TEMP VALUES('John', '2012-01-01 09:40:01', '2012-01-01 11:00:02') 
INSERT INTO #TEMP VALUES('John', '2012-01-02 05:00:01', '2012-01-02 05:15:02') 
INSERT INTO #TEMP VALUES('David', '2012-01-04 05:00:01', '2012-01-04 05:15:02') 
INSERT INTO #TEMP VALUES('David', '2012-01-05 07:01:01', '2012-01-05 15:15:02') 

SELECT * 
FROM #TEMP 

DROP TABLE #TEMP 

The data is:

  Name  START_TIME              END_TIME
1 John  2012-01-01 09:00:01.000 2012-01-01 12:00:02.000
2 John  2012-01-01 09:40:01.000 2012-01-01 11:00:02.000
3 John  2012-01-02 05:00:01.000 2012-01-02 05:15:02.000
4 David 2012-01-04 05:00:01.000 2012-01-04 05:15:02.000
5 David 2012-01-05 07:01:01.000 2012-01-05 08:15:02.000

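Given a number, say 6, I am trying to GROUP BY on this table and merge times that overlap within a window of 6 hours before and after. So in the table above, rows 1 and 2 would be merged into a single row, because they contain overlapping time ranges: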

John 2012-01-01 06:00:01.000 2012-01-01 18:00:02.000 

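Rows 4 and 5 would be merged, because subtracting 6 hours from 07:01:01.000 falls within the window of row 4.

Is there an efficient way to do this on a large table with about one million rows?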

I think there is a mistake in the answer text. When you say the first row becomes 06:00:01.000 - 18:00:02.000, shouldn't it be 03:00:01.000 - 18:00:02.000? (09:00 - 6h = 03:00, not 06:00) – danihp

Answers


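I think the best way to do this is to build a windows table and join the #TEMP table with this new windows table:

1) Step 1: prepare a windows table with all possible windows (including overlapping windows):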

-- One candidate window per row: 6 hours either side of START_TIME
SELECT
    Name,
    dateadd(hour, -6, start_time) as start_w,
    dateadd(hour, +6, start_time) as end_w
into #possible_windows
FROM #TEMP

2) Create an index on the temp table to improve performance:

create index pw_idx on #possible_windows (Name, start_w) 

3) Exclude the overlapping windows with a self-join select. This is the reason for creating the index:

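-- Keep only windows whose start does not fall inside another window
-- for the same Name (p1 is NULL when no such window exists)
select p2.*
into #myWindows
from #possible_windows p1
right outer join #possible_windows p2
    on p1.name = p2.name and
       p2.start_w > p1.start_W and p2.start_w <= p1.end_w
where p1.name is null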

4) Join your table with #myWindows, or use #myWindows directly.

Working example:
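Step 4 is not spelled out above, so here is a minimal sketch of what the join could look like. It assumes #TEMP and the #myWindows table from step 3 still exist, and it assigns each row to a window by its START_TIME; the output column names (first_start, last_end, merged_rows) are made up for illustration. Note that two rows with exactly the same start_w would both survive step 3, so duplicates may need handling first.

```sql
-- Sketch only: attach every #TEMP row to the surviving window that
-- contains its START_TIME, then aggregate per window.
SELECT
    w.Name,
    w.start_w,
    w.end_w,
    MIN(t.START_TIME) AS first_start,   -- earliest original start in the window
    MAX(t.END_TIME)   AS last_end,      -- latest original end in the window
    COUNT(*)          AS merged_rows    -- how many rows were merged
FROM #myWindows AS w
JOIN #TEMP AS t
    ON  t.Name = w.Name
    AND t.START_TIME >= w.start_w
    AND t.START_TIME <= w.end_w
GROUP BY w.Name, w.start_w, w.end_w;
```

Because the windows in this answer are built from START_TIME only, a row that starts inside a window but ends after end_w is still counted in that window; whether that is acceptable depends on the requirement.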

SELECT 
    Name, 
    dateadd(hour, -6, start_time) as start_w, 
    dateadd(hour, +6, start_time) as end_w, 
    ROW_NUMBER() over(partition by Name order by Name, 
        dateadd(hour, -6, start_time)) as rn 
into #possible_windows 
FROM #TEMP 

create index pw_idx on #possible_windows (Name, start_w) 

select p2.* 
from #possible_windows p1 
right outer join #possible_windows p2 
    on p1.name = p2.name and 
    p2.start_w > p1.start_W and p2.start_w <= p1.end_w 
where p1.name is null 

Result:

Name  start_w             end_w               rn
----- ------------------- ------------------- --
David 2012-01-03 23:00:01 2012-01-04 11:00:01 1
David 2012-01-05 01:01:01 2012-01-05 13:01:01 2
John  2012-01-01 03:00:01 2012-01-01 15:00:01 1
John  2012-01-01 23:00:01 2012-01-02 11:00:01 3

PS: please come back with the results of your performance tests!
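For comparison, a different technique that is often used for this kind of problem is the "gaps and islands" pattern with window functions. The sketch below is not the method from the answer above. It assumes SQL Server 2012 or later (needed for the ROWS frame on windowed aggregates) and, following the example in the question, pads START_TIME by -6 hours and END_TIME by +6 hours. The CTE and column names (padded, flagged, grouped, is_new_island, island_id) are made up for illustration.

```sql
-- Sketch only. The preceding statement must end with a semicolon
-- because this statement starts with WITH.
WITH padded AS (
    SELECT Name,
           DATEADD(hour, -6, START_TIME) AS start_w,
           DATEADD(hour,  6, END_TIME)   AS end_w
    FROM #TEMP
),
flagged AS (
    -- A row starts a new island when its start is later than every
    -- end seen so far for the same Name (NULL for the first row).
    SELECT Name, start_w, end_w,
           CASE WHEN start_w <= MAX(end_w) OVER (
                         PARTITION BY Name
                         ORDER BY start_w, end_w
                         ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
                THEN 0 ELSE 1 END AS is_new_island
    FROM padded
),
grouped AS (
    -- A running total of the flags gives each island its own id.
    SELECT Name, start_w, end_w,
           SUM(is_new_island) OVER (
               PARTITION BY Name
               ORDER BY start_w, end_w, is_new_island DESC
               ROWS UNBOUNDED PRECEDING) AS island_id
    FROM flagged
)
SELECT Name, MIN(start_w) AS start_w, MAX(end_w) AS end_w
FROM grouped
GROUP BY Name, island_id
ORDER BY Name, start_w;
```

With the sample data, John's rows 1 and 2 collapse into a single row running from 2012-01-01 03:00:01 to 2012-01-01 18:00:02. Unlike the self-join, this also merges chains where A overlaps B and B overlaps C but A does not overlap C. How it performs on a million rows would have to be measured; an index on (Name, START_TIME) is a reasonable thing to try first.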