2015-02-06 32 views
1

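Serialize dates in a "range" format without using a procedure

I need to find the date ranges within a column and serialize them in a concise format (start - end for a range, or just the date for a single-date range).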

I have a CTE (readings) that returns a dataset similar to:

ID   VALUE DATE 
1234567  A  2012-05-09 
1234567  A  2012-05-10 
1234567  A  2012-05-11 
1234567  A  2012-05-16 
1234567  A  2012-05-17 
1234567  A  2012-05-20 
1234567  B  2012-05-11 
1234567  B  2012-05-12 
1234567  B  2012-05-13 
1234567  B  2012-05-14 

I have been able to get:

ID   VALUE TOTAL_DAYS DATES 
1234567  A  6   2012-05-09; 2012-05-10; 2012-05-11; 2012-05-16; 2012-05-17; 2012-05-20 
1234567  B  4   2012-05-11; 2012-05-12; 2012-05-13; 2012-05-14 

using:

readings AS (
... 
) 
, 
readings_aggr AS (

    SELECT ID, [VALUE] 
      ,count(distinct date) TOTAL_DAYS 
      ,STUFF((
       SELECT '; ' + cast(date as varchar) 
       FROM readings r0 
       WHERE r1.id=r0.id          -- qualify the outer columns so the subquery is correlated
        AND r1.value=r0.value 
       ORDER BY date 
       FOR XML PATH(''),TYPE).value('(./text())[1]','VARCHAR(MAX)' 
      ),1,2,'') AS DATES 
    FROM readings r1 
    GROUP BY id, [value] 
) 

SELECT * FROM readings_aggr 

I would like to format it as:

ID   VALUE TOTAL_DAYS DATES 
1234567  A  6   2012-05-09 - 2012-05-11; 2012-05-16 - 2012-05-17; 2012-05-20 
1234567  B  4   2012-05-11 - 2012-05-14 

Is this possible without using a procedural approach?

+0

Your query doesn't use a procedure, so I don't understand what exactly the problem is. – 2015-02-06 14:53:25

+0

I don't want someone to post a solution that redesigns the query as a stored procedure. – craig 2015-02-06 15:12:23

Answers

3

You can use this query:

SELECT ID, VALUE, MIN([DATE]) AS startDate, MAX([DATE]) AS endDate 
    FROM (
     SELECT ID, VALUE, DATE, 
      DATEDIFF(Day, '1900-01-01' , [DATE])- ROW_NUMBER() OVER(PARTITION BY ID, VALUE ORDER BY [DATE]) AS DateGroup 
     FROM readings) rGroups 
    GROUP BY ID, VALUE, DateGroup 

to get a table expression containing all the start - end intervals in your data:

ID  VALUE startDate endDate 
-------------------------------------- 
1234567 A  2012-05-09 2012-05-11 
1234567 A  2012-05-16 2012-05-17 
1234567 A  2012-05-20 2012-05-20 
1234567 B  2012-05-11 2012-05-14 
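
To see why the subtraction works, it may help to look at the intermediate columns. The sketch below inlines a subset of the question's rows as a stand-in for the readings CTE (an assumption, since the real CTE body is elided) and adds two helper columns, DayNumber and RowNum, which are names made up for illustration. Within a run of consecutive dates both values grow by one per row, so their difference (DateGroup) stays constant; a gap in the dates makes the day number jump and starts a new group.

```sql
-- Stand-in for the question's readings CTE, using a subset of its sample rows.
WITH readings (ID, [VALUE], [DATE]) AS (
    SELECT ID, [VALUE], CAST([DATE] AS date)
    FROM (VALUES
        (1234567, 'A', '2012-05-09'),
        (1234567, 'A', '2012-05-10'),
        (1234567, 'A', '2012-05-11'),
        (1234567, 'A', '2012-05-16'),
        (1234567, 'A', '2012-05-17'),
        (1234567, 'A', '2012-05-20')
    ) AS v (ID, [VALUE], [DATE])
)
SELECT ID, [VALUE], [DATE],
       -- days elapsed since a fixed anchor date
       DATEDIFF(DAY, '1900-01-01', [DATE]) AS DayNumber,
       -- position of the row within its ID/VALUE partition
       ROW_NUMBER() OVER (PARTITION BY ID, [VALUE] ORDER BY [DATE]) AS RowNum,
       -- constant within each run of consecutive dates
       DATEDIFF(DAY, '1900-01-01', [DATE])
         - ROW_NUMBER() OVER (PARTITION BY ID, [VALUE] ORDER BY [DATE]) AS DateGroup
FROM readings
ORDER BY ID, [VALUE], [DATE];
```

The first three rows share one DateGroup value, the next two share another, and 2012-05-20 gets its own. Note that this relies on each date appearing at most once per ID/VALUE; duplicate dates would shift the row numbers and split the groups.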

Then use the query above inside readings_aggr:

;WITH start_end_readings AS (
    SELECT ID, VALUE, MIN([DATE]) AS startDate, MAX([DATE]) AS endDate 
    FROM (
     SELECT ID, VALUE, DATE, DATEDIFF(Day, '1900-01-01' , [DATE])- ROW_NUMBER() OVER(PARTITION BY ID, VALUE ORDER BY [DATE]) AS DateGroup 
     FROM readings) rGroups 
    GROUP BY ID, VALUE, DateGroup 
), readings_aggr AS (

    SELECT ID, [VALUE] 
      ,count(distinct date) TOTAL_DAYS 
      ,STUFF((
       SELECT '; ' + cast(startDate as varchar) + 
         CASE WHEN startDate <> endDate THEN ' - ' + cast(endDate as varchar) 
          ELSE '' 
         END 
       FROM start_end_readings r0 
       WHERE r1.id=r0.id AND r1.value=r0.value 
       ORDER BY startDate 
       FOR XML PATH(''),TYPE).value('(./text())[1]','VARCHAR(MAX)' 
      ),1,2,'') AS DATES 
    FROM readings AS r1 
    GROUP BY id, [value] 
) 
SELECT * FROM readings_aggr 

to get the desired result:

ID  VALUE TOTAL_DAYS DATES 
=========================================================================== 
1234567 A  6   2012-05-09 - 2012-05-11; 2012-05-16 - 2012-05-17; 2012-05-20 
1234567 B  4   2012-05-11 - 2012-05-14 

SQL Fiddle Demo here

+0

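After running into yet another string-aggregation requirement, I finally gave up and tried SQLCLR. It makes far fewer calls and is easier to optimize if you have a covering index, especially if you need to aggregate multiple fields. – 2015-02-06 15:58:50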

+0

@PanagiotisKanavos: Do you have an example of this? – craig 2015-02-11 20:11:15

+0

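@jlee-tesik's answer points to [an example from MSDN](https://msdn.microsoft.com/en-us/library/ms165055%28v=vs.90%29.aspx). That covers only the string-aggregation part; you still need to write the query that computes the ranges. – 2015-02-12 08:07:19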

1

You could probably use a CLR aggregate to do this.

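Here is an example from MSDN that concatenates data together. Just change the comma to a semicolon and you can get your current format with a more concise query.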

https://msdn.microsoft.com/en-us/library/ms165055%28v=vs.90%29.aspx

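Once that is in place, you can adjust the Accumulate and/or Terminate methods to inspect the data and output ranges where possible. You may want to accumulate the values into something like a SortedList rather than a StringBuilder, and then perform the range analysis in the Terminate method.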

1

You can do it like this:

DECLARE @t TABLE (ID INT, V CHAR(1), D DATE) 

INSERT INTO @t 
VALUES (1234567, 'A', '2012-05-09'), 
     (1234567, 'A', '2012-05-10'), 
     (1234567, 'A', '2012-05-11'), 
     (1234567, 'A', '2012-05-16'), 
     (1234567, 'A', '2012-05-17'), 
     (1234567, 'A', '2012-05-20'), 
     (1234567, 'B', '2012-05-11'), 
     (1234567, 'B', '2012-05-12'), 
     (1234567, 'B', '2012-05-13'), 
     (1234567, 'B', '2012-05-14'); 

WITH cte1 
      AS (SELECT ID , 
         V , 
         CASE WHEN MIN(D) <> MAX(D) 
          THEN CONVERT(NVARCHAR(MAX), MIN(D), 121) + ' - ' 
            + CONVERT(NVARCHAR(MAX), MAX(D), 121) 
          ELSE CONVERT(NVARCHAR(MAX), MIN(D), 121) 
         END AS D , 
         COUNT(*) AS cn 
       FROM  (SELECT ID , 
            V , 
            D , 
            DATEADD(dd, 
              -ROW_NUMBER() OVER (PARTITION BY ID, V ORDER BY D), -- partition by ID too, so several IDs do not share row numbers
              D) AS rn 
          FROM  @t 
         ) a 
       GROUP BY ID , 
         V , 
         rn 
      ),-- SELECT * FROM cte1, 
     cte2 
      AS (SELECT ID , 
         V , 
         SUM(cn) TOTAL_DAYS , 
         STUFF((SELECT '; ' + D 
           FROM  cte1 r0 
           WHERE cte1.id = r0.id 
             AND cte1.V = r0.V 
           ORDER BY r0.D -- ISO-formatted strings sort chronologically
         FOR XML PATH('') , 
            TYPE).value('(./text())[1]', 'VARCHAR(MAX)'), 
           1, 2, '') AS DATES 
       FROM  cte1 
       GROUP BY id , 
         V 
      ) 
    SELECT * 
    FROM cte2 

Output:

ID  V TOTAL_DAYS DATES 
1234567 A 6   2012-05-09 - 2012-05-11; 2012-05-16 - 2012-05-17; 2012-05-20 
1234567 B 4   2012-05-11 - 2012-05-14 

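The idea is to first get the islands (https://www.simple-talk.com/sql/t-sql-programming/the-sql-of-gaps-and-islands-in-sequences/) and then apply your STUFF. I know @Betsos beat me to it, and this is slightly different, but the idea is the same.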
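
For what it's worth, the same islands-then-concatenate idea can be written without the STUFF / FOR XML PATH step on newer servers. This is only a sketch and assumes SQL Server 2017 or later, where STRING_AGG is available; none of the answers above use it. The inlined readings rows are a stand-in for the question's CTE, and the names islands, grp and dayCount are made up for illustration. It also assumes each date occurs at most once per ID/VALUE.

```sql
-- Stand-in for the question's readings CTE, using its sample rows.
WITH readings (ID, [VALUE], [DATE]) AS (
    SELECT ID, [VALUE], CAST([DATE] AS date)
    FROM (VALUES
        (1234567, 'A', '2012-05-09'), (1234567, 'A', '2012-05-10'),
        (1234567, 'A', '2012-05-11'), (1234567, 'A', '2012-05-16'),
        (1234567, 'A', '2012-05-17'), (1234567, 'A', '2012-05-20'),
        (1234567, 'B', '2012-05-11'), (1234567, 'B', '2012-05-12'),
        (1234567, 'B', '2012-05-13'), (1234567, 'B', '2012-05-14')
    ) AS v (ID, [VALUE], [DATE])
),
-- Step 1: collapse each run of consecutive dates into one start/end row.
islands AS (
    SELECT ID, [VALUE],
           MIN([DATE]) AS startDate,
           MAX([DATE]) AS endDate,
           COUNT(*)    AS dayCount   -- valid only if dates are unique per ID/VALUE
    FROM (
        SELECT ID, [VALUE], [DATE],
               DATEDIFF(DAY, '1900-01-01', [DATE])
                 - ROW_NUMBER() OVER (PARTITION BY ID, [VALUE] ORDER BY [DATE]) AS grp
        FROM readings
    ) AS g
    GROUP BY ID, [VALUE], grp
)
-- Step 2: format each island and concatenate the pieces in date order.
SELECT ID, [VALUE],
       SUM(dayCount) AS TOTAL_DAYS,
       STRING_AGG(
           CONVERT(varchar(10), startDate, 23)              -- style 23 = yyyy-mm-dd
             + CASE WHEN startDate <> endDate
                    THEN ' - ' + CONVERT(varchar(10), endDate, 23)
                    ELSE ''
               END,
           '; ') WITHIN GROUP (ORDER BY startDate) AS DATES
FROM islands
GROUP BY ID, [VALUE];
```

On older versions the STUFF / FOR XML PATH approach shown in the answers remains the way to go.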