2012-02-14 61 views
8

Create groups of consecutive days meeting given criteria

I have a table with the following data structure in SQL Server:

ID, Date, Allocation
1, 2012-01-01, 0
2, 2012-01-02, 2
3, 2012-01-03, 0
4, 2012-01-04, 0
5, 2012-01-05, 0
6, 2012-01-06, 5

What I need to do is get all the consecutive day periods where Allocation = 0, in the following form:

Start Date  End Date    DayCount
2012-01-01  2012-01-01  1
2012-01-03  2012-01-05  3

Is it possible to do this in SQL, and if so, how?

+0

@istari Is End Date a column in your table structure? – Devjosh 2012-02-14 10:09:04

+0

Have you tried using a cursor? Or do you not want cursors? – Vikram 2012-02-14 10:31:31

+0

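By "consecutive", do you mean "one day apart", or "adjacent when the rows are sorted by date"? That is, does each unique date appear exactly once in the 'Date' column? – gcbenison 2012-02-14 13:40:25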

Answers

3

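In this answer, I'll assume that the "id" field numbers the rows consecutively when they are sorted by increasing date, as it does in the example data. (If such a column does not exist, one can be created.)

This is an example of a technique described here and here.

1) Join the table to itself on adjacent "id" values. This pairs up adjacent rows. Select the rows where the "allocation" field has changed. Store the result in a temporary table, keeping a running index as well.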

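-- MySQL syntax (not T-SQL): the user variable @idx provides the running index
SET @idx = 0;
CREATE TEMPORARY TABLE boundaries
SELECT
    (@idx := @idx + 1) AS idx,
    a1.date AS prev_end,         -- last day of the period that ends here
    a2.date AS next_start,       -- first day of the period that starts here
    a1.allocation AS allocation  -- allocation of the period that ends here
FROM allocations a1
JOIN allocations a2
    ON (a2.id = a1.id + 1)       -- pair each row with the next one
WHERE a1.allocation != a2.allocation;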

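This gives you a table that has, in each row, "the end of the previous period", "the start of the next period", and "the value of 'allocation' in the previous period":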

+------+------------+------------+------------+
| idx  | prev_end   | next_start | allocation |
+------+------------+------------+------------+
|    1 | 2012-01-01 | 2012-01-02 |          0 |
|    2 | 2012-01-02 | 2012-01-03 |          2 |
|    3 | 2012-01-05 | 2012-01-06 |          0 |
+------+------------+------------+------------+

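2) We need the start and end of each period in the same row, so we have to combine adjacent rows again. Do this by creating a second temporary table like boundaries, but with an idx field that is 1 greater: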

+------+------------+------------+
| idx  | prev_end   | next_start |
+------+------------+------------+
|    2 | 2012-01-01 | 2012-01-02 |
|    3 | 2012-01-02 | 2012-01-03 |
|    4 | 2012-01-05 | 2012-01-06 |
+------+------------+------------+

Now join on the idx field and we get the answer:

SELECT 
    boundaries2.next_start AS start, 
    boundaries.prev_end AS end, 
    allocation 
FROM boundaries 
JOIN boundaries2 
USING(idx); 

+------------+------------+------------+
| start      | end        | allocation |
+------------+------------+------------+
| 2012-01-02 | 2012-01-02 |          2 |
| 2012-01-03 | 2012-01-05 |          0 |
+------------+------------+------------+
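Step 2 describes the second table only in words, so here is a minimal Python sketch of the same shift-and-join, not the SQL itself. It assumes the three rows of `boundaries` shown above; the names `boundaries2` and `joined` are only illustrative.

```python
# Rows of the boundaries table: (idx, prev_end, next_start, allocation).
boundaries = [
    (1, "2012-01-01", "2012-01-02", 0),
    (2, "2012-01-02", "2012-01-03", 2),
    (3, "2012-01-05", "2012-01-06", 0),
]

# boundaries2: the same rows with idx shifted up by one, keyed by idx.
boundaries2 = {
    idx + 1: (prev_end, next_start)
    for idx, prev_end, next_start, _ in boundaries
}

# Join on idx: the period ending at boundaries.prev_end started at the
# next_start of the boundary just before it.
joined = [
    (boundaries2[idx][1], prev_end, allocation)
    for idx, prev_end, _, allocation in boundaries
    if idx in boundaries2
]

for start, end, allocation in joined:
    print(start, end, allocation)
```

The row with idx = 1 has no partner in `boundaries2`, so the first period drops out of the join.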

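Note that this answer correctly obtains the "interior" periods, but misses the two "edge" periods: the one at the beginning where allocation = 0 and the one at the end where allocation = 5. Those can be pulled in using UNION clauses, but I wanted to present the core idea without that complication.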
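One way to see what including the edge periods looks like is to treat every change in allocation as a boundary and to close the last period at the end of the data. The following is a minimal Python sketch of that idea, not the UNION-based SQL. The `rows` list and the `periods` helper are hypothetical names, and it makes the same assumption as the answer: rows are one day apart once sorted by date.

```python
from datetime import date
from itertools import groupby

# Sample rows from the question: (id, date, allocation).
rows = [
    (1, date(2012, 1, 1), 0),
    (2, date(2012, 1, 2), 2),
    (3, date(2012, 1, 3), 0),
    (4, date(2012, 1, 4), 0),
    (5, date(2012, 1, 5), 0),
    (6, date(2012, 1, 6), 5),
]

def periods(rows):
    """Return (start, end, allocation) for each run of equal allocation.

    Hypothetical helper; assumes consecutive rows are one day apart
    once sorted by date.
    """
    ordered = sorted(rows, key=lambda r: r[1])
    result = []
    # groupby opens a new group whenever the allocation value changes,
    # which is the same boundary the self-join detects.
    for allocation, run in groupby(ordered, key=lambda r: r[2]):
        run = list(run)
        result.append((run[0][1], run[-1][1], allocation))
    return result

# Keep only the allocation = 0 periods and add the day count.
zero_periods = [
    (start, end, (end - start).days + 1)
    for start, end, allocation in periods(rows)
    if allocation == 0
]
for start, end, day_count in zero_periods:
    print(start, end, day_count)
```

On the sample data, `periods(rows)` returns four periods, including the first (allocation 0) and the last (allocation 5).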

0

A solution without a CTE:

SELECT a.aDate AS StartDate 
    , MIN(c.aDate) AS EndDate 
    , (datediff(day, a.aDate, MIN(c.aDate)) + 1) AS DayCount 
FROM (
    SELECT x.aDate, x.allocation, COUNT(*) idn FROM table1 x 
    JOIN table1 y ON y.aDate <= x.aDate 
    GROUP BY x.id, x.aDate, x.allocation 
) AS a 
LEFT JOIN (
    SELECT x.aDate, x.allocation, COUNT(*) idn FROM table1 x 
    JOIN table1 y ON y.aDate <= x.aDate 
    GROUP BY x.id, x.aDate, x.allocation 
) AS b ON a.idn = b.idn + 1 AND b.allocation = a.allocation 
LEFT JOIN (
    SELECT x.aDate, x.allocation, COUNT(*) idn FROM table1 x 
    JOIN table1 y ON y.aDate <= x.aDate 
    GROUP BY x.id, x.aDate, x.allocation 
) AS c ON a.idn <= c.idn AND c.allocation = a.allocation 
LEFT JOIN (
    SELECT x.aDate, x.allocation, COUNT(*) idn FROM table1 x 
    JOIN table1 y ON y.aDate <= x.aDate 
    GROUP BY x.id, x.aDate, x.allocation 
) AS d ON c.idn = d.idn - 1 AND d.allocation = c.allocation 
WHERE b.idn IS NULL AND c.idn IS NOT NULL AND d.idn IS NULL AND a.allocation = 0 
GROUP BY a.aDate 

Example

+0

When I run this, I get the following error message: Msg 530, Level 16, State 1, Line 1 The statement terminated. In statement c – Istari 2012-02-14 11:17:19

3

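The following will do it. The gist of the solution is:

  • Use a CTE to get a list of all consecutive start and end dates with Allocation = 0.
  • Use the ROW_NUMBER window function to assign row numbers depending on both the start and end dates.
  • Select only those records where both ROW_NUMBERs equal 1.
  • Use DATEDIFF to calculate the DayCount.

SQL statement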

;WITH r AS (
    SELECT StartDate = Date, EndDate = Date 
    FROM YourTable 
    WHERE Allocation = 0 
    UNION ALL 
    SELECT r.StartDate, q.Date 
    FROM r 
      INNER JOIN YourTable q ON DATEDIFF(dd, r.EndDate, q.Date) = 1 
    WHERE q.Allocation = 0   
) 
SELECT [Start Date] = s.StartDate 
     , [End Date ] = s.EndDate 
     , [DayCount] = DATEDIFF(dd, s.StartDate, s.EndDate) + 1 
FROM (
      SELECT * 
        , rn1 = ROW_NUMBER() OVER (PARTITION BY StartDate ORDER BY EndDate DESC) 
        , rn2 = ROW_NUMBER() OVER (PARTITION BY EndDate ORDER BY StartDate ASC) 
      FROM r   
     ) s 
WHERE s.rn1 = 1 
     AND s.rn2 = 1 
OPTION (MAXRECURSION 0) 

Test script

;WITH q (ID, Date, Allocation) AS (
    SELECT * FROM (VALUES 
    (1, '2012-01-01', 0) 
    , (2, '2012-01-02', 2) 
    , (3, '2012-01-03', 0) 
    , (4, '2012-01-04', 0) 
    , (5, '2012-01-05', 0) 
    , (6, '2012-01-06', 5) 
) a (a, b, c) 
) 
, r AS (
    SELECT StartDate = Date, EndDate = Date 
    FROM q 
    WHERE Allocation = 0 
    UNION ALL 
    SELECT r.StartDate, q.Date 
    FROM r 
      INNER JOIN q ON DATEDIFF(dd, r.EndDate, q.Date) = 1 
    WHERE q.Allocation = 0   
) 
SELECT s.StartDate, s.EndDate, DATEDIFF(dd, s.StartDate, s.EndDate) + 1 
FROM (
      SELECT * 
        , rn1 = ROW_NUMBER() OVER (PARTITION BY StartDate ORDER BY EndDate DESC) 
        , rn2 = ROW_NUMBER() OVER (PARTITION BY EndDate ORDER BY StartDate ASC) 
      FROM r   
     ) s 
WHERE s.rn1 = 1 
     AND s.rn2 = 1 
OPTION (MAXRECURSION 0) 
+0

@Istari The maximum recursion of 100 was exhausted before the statement completed - I have added a MAXRECURSION option to fix the error message. – 2012-02-14 12:27:47
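To see why both row numbers are needed, it may help to look at what the recursive CTE produces before the filter. Below is a rough Python sketch of that logic, not a translation of the T-SQL. The `zero_days` set is taken from the question's sample data and the other names are illustrative. Every Allocation = 0 day becomes a start and is extended one day at a time, so a run of n days yields n(n+1)/2 (start, end) pairs and needs n - 1 recursion steps, which appears to be how a long run exhausts the default limit of 100.

```python
from datetime import date, timedelta

# Days with Allocation = 0 in the question's sample data.
zero_days = {
    date(2012, 1, 1),
    date(2012, 1, 3),
    date(2012, 1, 4),
    date(2012, 1, 5),
}
one_day = timedelta(days=1)

# Mimic the recursive CTE r: each zero day is an anchor (start = end),
# then the end is pushed forward while the next day is also a zero day.
pairs = []
for start in sorted(zero_days):
    end = start
    pairs.append((start, end))
    while end + one_day in zero_days:
        end += one_day
        pairs.append((start, end))

# rn1 = 1 keeps the latest end per start; rn2 = 1 keeps the earliest
# start per end. Only maximal runs satisfy both conditions.
latest_end = {}
earliest_start = {}
for start, end in pairs:
    latest_end[start] = max(end, latest_end.get(start, end))
    earliest_start[end] = min(start, earliest_start.get(end, start))

runs = [
    (start, end, (end - start).days + 1)
    for start, end in pairs
    if latest_end[start] == end and earliest_start[end] == start
]
for run in runs:
    print(*run)
```

With rn1 alone, the pair (2012-01-04, 2012-01-05) would survive, since it is the longest period starting on the 4th; rn2 removes it because the 5th is already reached from the 3rd.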

1

An alternative approach, with a CTE but without ROW_NUMBER().

Sample data:

if object_id('tempdb..#tab') is not null 
    drop table #tab 

create table #tab (id int, date datetime, allocation int) 

insert into #tab 
select 1, '2012-01-01', 0 union 
select 2, '2012-01-02', 2 union 
select 3, '2012-01-03', 0 union 
select 4, '2012-01-04', 0 union 
select 5, '2012-01-05', 0 union 
select 6, '2012-01-06', 5 union 
select 7, '2012-01-07', 0 union 
select 8, '2012-01-08', 5 union 
select 9, '2012-01-09', 0 union 
select 10, '2012-01-10', 0 

Query:

;with cte(s_id, e_id, b_id) as (
    select s.id, e.id, b.id 
    from #tab s 
    left join #tab e on dateadd(dd, 1, s.date) = e.date and e.allocation = 0 
    left join #tab b on dateadd(dd, -1, s.date) = b.date and b.allocation = 0 
    where s.allocation = 0 
) 
select ts.date as [start date], te.date as [end date], count(*) as [day count] from (
    select c1.s_id as s, (
     select min(s_id) from cte c2 
     where c2.e_id is null and c2.s_id >= c1.s_id 
    ) as e 
    from cte c1 
    where b_id is null 
) t 
join #tab t1 on t1.id between t.s and t.e and t1.allocation = 0 
join #tab ts on ts.id = t.s 
join #tab te on te.id = t.e 
group by t.s, t.e, ts.date, te.date 

Live example at data.SE

1

Using this sample data:

CREATE TABLE MyTable (ID INT, Date DATETIME, Allocation INT); 
INSERT INTO MyTable VALUES (1, {d '2012-01-01'}, 0); 
INSERT INTO MyTable VALUES (2, {d '2012-01-02'}, 2); 
INSERT INTO MyTable VALUES (3, {d '2012-01-03'}, 0); 
INSERT INTO MyTable VALUES (4, {d '2012-01-04'}, 0); 
INSERT INTO MyTable VALUES (5, {d '2012-01-05'}, 0); 
INSERT INTO MyTable VALUES (6, {d '2012-01-06'}, 5); 
GO 

Try this:

WITH DateGroups (ID, Date, Allocation, SeedID) AS (
    SELECT MyTable.ID, MyTable.Date, MyTable.Allocation, MyTable.ID 
     FROM MyTable 
     LEFT JOIN MyTable Prev ON Prev.Date = DATEADD(d, -1, MyTable.Date) 
          AND Prev.Allocation = 0 
    WHERE Prev.ID IS NULL 
     AND MyTable.Allocation = 0 
    UNION ALL 
    SELECT MyTable.ID, MyTable.Date, MyTable.Allocation, DateGroups.SeedID 
     FROM MyTable 
     JOIN DateGroups ON MyTable.Date = DATEADD(d, 1, DateGroups.Date) 
    WHERE MyTable.Allocation = 0 

), StartDates (ID, StartDate, DayCount) AS (
    SELECT SeedID, MIN(Date), COUNT(ID) 
     FROM DateGroups 
    GROUP BY SeedID 

), EndDates (ID, EndDate) AS (
    SELECT SeedID, MAX(Date) 
     FROM DateGroups 
    GROUP BY SeedID 

) 
SELECT StartDates.StartDate, EndDates.EndDate, StartDates.DayCount 
    FROM StartDates 
    JOIN EndDates ON StartDates.ID = EndDates.ID; 

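The first part of the query is a recursive SELECT, anchored by all rows that have Allocation = 0 and whose previous day either does not exist or has Allocation != 0. This effectively returns IDs 1 and 3, which are the start dates of the periods you want returned.

The recursive part of the query starts from the anchor rows and finds all subsequent dates that also have Allocation = 0. SeedID keeps track of the anchor's ID through all iterations.

The result so far is this: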

ID          Date                    Allocation  SeedID
----------- ----------------------- ----------- -----------
1           2012-01-01 00:00:00.000 0           1
3           2012-01-03 00:00:00.000 0           3
4           2012-01-04 00:00:00.000 0           3
5           2012-01-05 00:00:00.000 0           3

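The next subquery uses a simple GROUP BY to pick out the start date for each SeedID, and also counts the days.

The last subquery does the same thing for the end dates, but this time the day count is not needed because we already have it.

The final SELECT query joins the two together to combine the start and end dates, and returns them along with the day count.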
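For readers who find the recursion easier to follow outside SQL, here is a small Python sketch of the same seed-propagation idea. It only mirrors the logic described above and is not the T-SQL itself. The names `by_date`, `frontier`, and `date_groups` are illustrative, and it assumes each date appears at most once in the table.

```python
from collections import defaultdict
from datetime import date, timedelta

# Sample rows from the answer: (ID, Date, Allocation).
rows = [
    (1, date(2012, 1, 1), 0),
    (2, date(2012, 1, 2), 2),
    (3, date(2012, 1, 3), 0),
    (4, date(2012, 1, 4), 0),
    (5, date(2012, 1, 5), 0),
    (6, date(2012, 1, 6), 5),
]
by_date = {d: (row_id, allocation) for row_id, d, allocation in rows}
one_day = timedelta(days=1)

def is_zero(day):
    """True if a row exists for this day and its allocation is 0."""
    return day in by_date and by_date[day][1] == 0

# Anchor: allocation = 0 rows whose previous day is missing or non-zero.
# Each tuple is (ID, Date, SeedID); an anchor seeds itself.
frontier = [
    (row_id, d, row_id)
    for row_id, d, allocation in rows
    if allocation == 0 and not is_zero(d - one_day)
]

date_groups = []
while frontier:
    date_groups.extend(frontier)
    # Recursive step: the following day joins the group if it is also 0,
    # carrying the SeedID of the row it was reached from.
    frontier = [
        (by_date[d + one_day][0], d + one_day, seed)
        for _, d, seed in frontier
        if is_zero(d + one_day)
    ]

# GROUP BY SeedID: MIN(Date), MAX(Date), COUNT(ID).
grouped = defaultdict(list)
for _, d, seed in date_groups:
    grouped[seed].append(d)
result = [(min(ds), max(ds), len(ds)) for _, ds in sorted(grouped.items())]

for start, end, day_count in result:
    print(start, end, day_count)
```

Each pass of the `while` loop corresponds to one level of the recursive CTE, so `date_groups` ends up with the same four rows as the intermediate result shown above.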

1

Try this and see if it works for you. Here SDATE corresponds to your DATE column; adjust it to match your table.

SELECT SDATE, 
CASE WHEN (SELECT COUNT(*)-1 FROM TABLE1 WHERE ID BETWEEN TBL1.ID AND (SELECT MIN(ID) FROM TABLE1 WHERE ID > TBL1.ID AND ALLOCATION!=0)) >0 THEN(
CASE WHEN (SELECT SDATE FROM TABLE1 WHERE ID =(SELECT MAX(ID) FROM TABLE1 WHERE ID >TBL1.ID AND ID<(SELECT MIN(ID) FROM TABLE1 WHERE ID > TBL1.ID AND ALLOCATION!=0))) IS NULL THEN SDATE 
ELSE (SELECT SDATE FROM TABLE1 WHERE ID =(SELECT MAX(ID) FROM TABLE1 WHERE ID >TBL1.ID AND ID<(SELECT MIN(ID) FROM TABLE1 WHERE ID > TBL1.ID AND ALLOCATION!=0))) END 
)ELSE (SELECT SDATE FROM TABLE1 WHERE ID = (SELECT MAX(ID) FROM TABLE1 WHERE ID > TBL1.ID))END AS EDATE 
,CASE WHEN (SELECT COUNT(*)-1 FROM TABLE1 WHERE ID BETWEEN TBL1.ID AND (SELECT MIN(ID) FROM TABLE1 WHERE ID > TBL1.ID AND ALLOCATION!=0)) <0 THEN 
(SELECT COUNT(*) FROM TABLE1 WHERE ID BETWEEN TBL1.ID AND (SELECT MAX(ID) FROM TABLE1 WHERE ID > TBL1.ID)) ELSE 
(SELECT COUNT(*)-1 FROM TABLE1 WHERE ID BETWEEN TBL1.ID AND (SELECT MIN(ID) FROM TABLE1 WHERE ID > TBL1.ID AND ALLOCATION!=0)) END AS DAYCOUNT 
FROM TABLE1 TBL1 WHERE ALLOCATION = 0 
AND (((SELECT ALLOCATION FROM TABLE1 WHERE ID=(SELECT MAX(ID) FROM TABLE1 WHERE ID < TBL1.ID))<> 0) OR (SELECT MAX(ID) FROM TABLE1 WHERE ID < TBL1.ID)IS NULL);