2010-04-01 108 views
6

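Is there a better way to merge overlapping date intervals?
The solution I came up with is very simple, and now I am wondering whether anyone else has a better idea of how to do this.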

/***** DATA EXAMPLE *****/ 
DECLARE @T TABLE (d1 DATETIME, d2 DATETIME) 
INSERT INTO @T (d1, d2) 
     SELECT '2010-01-01','2010-03-31' UNION SELECT '2010-04-01','2010-05-31' 
    UNION SELECT '2010-06-15','2010-06-25' UNION SELECT '2010-06-26','2010-07-10' 
    UNION SELECT '2010-08-01','2010-08-05' UNION SELECT '2010-08-01','2010-08-09' 
    UNION SELECT '2010-08-02','2010-08-07' UNION SELECT '2010-08-08','2010-08-08' 
    UNION SELECT '2010-08-09','2010-08-12' UNION SELECT '2010-07-04','2010-08-16' 
    UNION SELECT '2010-11-01','2010-12-31' UNION SELECT '2010-03-01','2010-06-13' 

/***** INTERVAL ANALYSIS *****/ 
WHILE (1=1) BEGIN 
    UPDATE t1 SET t1.d2 = t2.d2 
    FROM @T AS t1 INNER JOIN @T AS t2 ON 
      DATEADD(day, 1, t1.d2) BETWEEN t2.d1 AND t2.d2 
    IF @@ROWCOUNT = 0 BREAK 
END 

/***** RESULT *****/ 
SELECT StartDate = MIN(d1) , EndDate = d2 
FROM @T 
GROUP BY d2 
ORDER BY StartDate, EndDate 

/***** OUTPUT *****/ 
/***** 
StartDate EndDate 
2010-01-01 2010-06-13 
2010-06-15 2010-08-16 
2010-11-01 2010-12-31 
*****/ 
+1

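Are the intervals open-open, closed-closed, open-closed, or closed-open? It matters because the end conditions differ slightly. For many purposes, a half-open interval (the first date included, the second excluded) is the best representation; closed-closed (both ends included) tends to be what people think of. – 2010-04-01 14:56:33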

+0

Jonathan, I have in mind the case where both days (the start date and the end date) are part of the period. – leoinfo 2010-04-01 15:09:52

+0

It can be done in a single pass, but that is a cursor implementation, so how well it works depends on the size of the data set. – 2010-05-02 11:22:06

Answers

0

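In this solution, I create a temporary calendar table that stores one value for each day in a range. This type of table can be static. Also, I originally stored only 400-odd dates starting from 2009-12-31. Obviously, if your dates span a larger range, you will need more values.

In addition, this solution only works on SQL Server 2005 or later, because I am using CTEs.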

With Calendar As 
    (
    Select DateAdd(d, ROW_NUMBER() OVER (ORDER BY s1.object_id), '1900-01-01') As [Date] 
    From sys.columns as s1 
     Cross Join sys.columns as s2 
    ) 
    , StopDates As 
    (
    Select C.[Date] 
    From Calendar As C 
     Left Join @T As T 
      On C.[Date] Between T.d1 And T.d2 
    Where C.[Date] >= (Select Min(T2.d1) From @T As T2) 
     And C.[Date] <= (Select Max(T2.d2) From @T As T2) 
     And T.d1 Is Null 
    ) 
    , StopDatesInUse As 
    (
    Select D1.[Date] 
    From StopDates As D1 
     Left Join StopDates As D2 
      On D1.[Date] = DateAdd(d,1,D2.Date) 
    Where D2.[Date] Is Null 
    ) 
    , DataWithEariestStopDate As 
    (
    Select * 
    , (Select Min(SD2.[Date]) 
     From StopDatesInUse As SD2 
     Where T.d2 < SD2.[Date]) As StopDate 
    From @T As T 
    ) 
Select Min(d1), Max(d2) 
From DataWithEariestStopDate 
Group By StopDate 
Order By Min(d1) 

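Edit: the problem with using dates in 2009 had nothing to do with the final query. The problem was that the calendar table was not big enough: I had started the calendar at 2009-12-31. I have revised it to start at 1900-01-01.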

+0

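Your code merges intervals that should not be merged. With these initial intervals /**/ SELECT '2009-01-01','2009-01-01' UNION SELECT '2009-01-03','2009-01-03' /**/ the code returns a single period: 2009-01-01 to 2009-01-03. In this case, 2009-01-02 should not be included in the resulting interval. – leoinfo 2010-04-07 19:07:35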

+0

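First, you should state in your schema whether D1 = D2 is possible. None of your sample data shows this. Second, if you **add** {2010-01-01, 2010-01-01} to the existing sample data, the first range should still be 2010-01-01 to 2010-06-13, because the first entry in your example covers 2010-01-01 to 2010-03-31. Third, if you **replace** the first entry in your example with {2010-01-01, 2010-01-01}, {2010-03-01, 2010-03-01}, my query's results are still correct. After making that change, the first two entries come out as {2010-01-01, 2010-01-01}, {2010-03-01, 2010-06-13}. – Thomas 2010-04-07 20:40:08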

+0

There is also the case where you replace all of the entries with just {2010-01-01, 2010-01-01}, {2010-03-01, 2010-03-01}; you then get the same two entries back. – Thomas 2010-04-07 20:42:28

0

Try this:

;WITH T1 AS 
(
    SELECT d1, d2, ROW_NUMBER() OVER(ORDER BY (SELECT 0)) AS R 
    FROM @T 
), NUMS AS 
(
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT 0)) AS R 
    FROM T1 A 
    CROSS JOIN T1 B 
    CROSS JOIN T1 C 
), ONERANGE AS 
(
    SELECT DISTINCT DATEADD(DAY, ROW_NUMBER() OVER(PARTITION BY T1.R ORDER BY (SELECT 0)) - 1, T1.D1) AS ELEMENT 
    FROM T1 
    CROSS JOIN NUMS 
    WHERE NUMS.R <= DATEDIFF(DAY, d1, d2) + 1 
), SEQUENCE AS 
(
    SELECT ELEMENT, DATEDIFF(DAY, '19000101', ELEMENT) - ROW_NUMBER() OVER(ORDER BY ELEMENT) AS rownum 
    FROM ONERANGE 
) 
SELECT MIN(ELEMENT) AS StartDate, MAX(ELEMENT) as EndDate 
FROM SEQUENCE 
GROUP BY rownum 

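The basic idea is to first expand the existing data so that you get a separate row for each day. This is done in ONERANGE.

After that, look at how the date increment relates to the row-number increment. Within an existing range (island), the difference between the two stays constant. As soon as you reach a new island of data, the difference increases, because the date jumps by more than 1 while the row number increases by exactly 1.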
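To make the grouping key concrete, here is a minimal sketch that runs the same subtraction on five hand-picked days. The table variable `@Days` and the column aliases `DayNumber`, `RowNum` and `GroupKey` are made up for this illustration and are not part of the query above.

```sql
-- Hypothetical sample: two islands of consecutive days (Jan 1-3 and Jan 10-11).
DECLARE @Days TABLE (ELEMENT DATETIME);
INSERT INTO @Days (ELEMENT)
          SELECT '2010-01-01' UNION ALL SELECT '2010-01-02'
UNION ALL SELECT '2010-01-03' UNION ALL SELECT '2010-01-10'
UNION ALL SELECT '2010-01-11';

SELECT ELEMENT,
       -- Days elapsed since a fixed anchor date; grows by 1 per calendar day.
       DATEDIFF(DAY, '19000101', ELEMENT) AS DayNumber,
       -- Position of the row when ordered by date; grows by 1 per row.
       ROW_NUMBER() OVER (ORDER BY ELEMENT) AS RowNum,
       -- Constant while the days are consecutive, larger after every gap.
       DATEDIFF(DAY, '19000101', ELEMENT)
           - ROW_NUMBER() OVER (ORDER BY ELEMENT) AS GroupKey
FROM @Days
ORDER BY ELEMENT;
```

The first three rows share one `GroupKey` and the last two share a larger one, which is why `GROUP BY` on that value with `MIN` and `MAX` yields one row per island.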

13

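I was looking for the same solution and came across this post: Combine overlapping datetime to return single overlapping range record

There is another thread on Packing Date Intervals.

I tested it with different date ranges, including the ones listed here, and it worked correctly every time.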


-- Column names follow the question's @T (d1 = start, d2 = end).
SELECT 
     s1.d1 AS StartDate, 
     MIN(t1.d2) AS EndDate 
FROM @T s1 
-- Candidate end dates: an end date that does not fall inside another interval that ends later.
INNER JOIN @T t1 ON s1.d1 <= t1.d2 
    AND NOT EXISTS(SELECT * FROM @T t2 
       WHERE t1.d2 >= t2.d1 AND t1.d2 < t2.d2) 
-- Candidate start dates: a start date that does not fall inside another interval that starts earlier.
WHERE NOT EXISTS(SELECT * FROM @T s2 
       WHERE s1.d1 > s2.d1 AND s1.d1 <= s2.d2) 
GROUP BY s1.d1 
ORDER BY s1.d1 

The result is:

StartDate | EndDate 
2010-01-01 | 2010-06-13 
2010-06-15 | 2010-06-25 
2010-06-26 | 2010-08-16 
2010-11-01 | 2010-12-31 
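
Note that this result keeps 2010-06-15 to 2010-06-25 and 2010-06-26 to 2010-08-16 as separate rows, because the query merges only intervals that actually overlap, while the expected output in the question also merges intervals that are merely adjacent. The following is a sketch of one possible adaptation, assuming whole-day values where both end days are inclusive: each comparison is widened by one day with `DATEADD`. It is a variation for illustration, not the query from the linked threads.

```sql
-- Sample data from the question (d1 = start, d2 = end, both days inclusive).
DECLARE @T TABLE (d1 DATETIME, d2 DATETIME);
INSERT INTO @T (d1, d2)
          SELECT '2010-01-01','2010-03-31' UNION SELECT '2010-04-01','2010-05-31'
    UNION SELECT '2010-06-15','2010-06-25' UNION SELECT '2010-06-26','2010-07-10'
    UNION SELECT '2010-08-01','2010-08-05' UNION SELECT '2010-08-01','2010-08-09'
    UNION SELECT '2010-08-02','2010-08-07' UNION SELECT '2010-08-08','2010-08-08'
    UNION SELECT '2010-08-09','2010-08-12' UNION SELECT '2010-07-04','2010-08-16'
    UNION SELECT '2010-11-01','2010-12-31' UNION SELECT '2010-03-01','2010-06-13';

SELECT s1.d1      AS StartDate,
       MIN(t1.d2) AS EndDate
FROM @T AS s1
INNER JOIN @T AS t1
    ON s1.d1 <= t1.d2
   -- t1.d2 closes a merged period only if no other interval starts on or
   -- before the following day and ends later.
   AND NOT EXISTS (SELECT * FROM @T AS t2
                   WHERE t2.d1 <= DATEADD(day, 1, t1.d2)
                     AND t2.d2 >  t1.d2)
-- s1.d1 opens a merged period only if no earlier-starting interval
-- reaches at least the day before it.
WHERE NOT EXISTS (SELECT * FROM @T AS s2
                  WHERE s2.d1 <  s1.d1
                    AND s1.d1 <= DATEADD(day, 1, s2.d2))
GROUP BY s1.d1
ORDER BY s1.d1;
```

For the sample data this should return the three periods listed in the question: 2010-01-01 to 2010-06-13, 2010-06-15 to 2010-08-16, and 2010-11-01 to 2010-12-31.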
+0

Also, here is another example that explains how to implement this: http://www.sqlmag.com/blog/puzzled-by-t-sql-blog-15/tsql/packing-date-intervals-136831 – user1045402 2011-11-23 11:50:29

+0

You can edit your own answer to add more information; just click the "edit" link at the bottom of the answer. – ForceMagic 2012-11-06 23:15:41

+0

Works perfectly, and it is concise! – ensisNoctis 2016-09-08 12:41:19