2013-06-20 52 views
1

Find a pattern in date order

I have a table with three columns: id (int), date (date), and status (bool).

Like this:

id date  Status 
1 2012-10-18 1 
1 2012-10-19 1 
1 2012-10-20 0 
1 2012-10-21 0 
1 2012-10-22 0 
1 2012-10-23 0 
1 2012-10-24 1 
1 2012-10-25 0 
1 2012-10-26 0 
1 2012-10-27 0 
1 2012-10-28 1 
2 2012-10-19 0 
2 2012-10-20 0 
2 2012-10-21 0 
2 2012-10-22 1 
2 2012-10-23 1 

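Assume the date column is contiguous, with no gaps between dates.

How can I find every run of 3 consecutive zeros (in the Status column) together with the status of the day after each run?

Like this: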

id startDate  endDate  NextDayStatus 
1 2012-10-20 2012-10-22   0 
1 2012-10-21 2012-10-23   1 
1 2012-10-25 2012-10-27   1 
2 2012-10-19 2012-10-21   1 

Table creation script and sample data

CREATE TABLE [Table1](
    [ID] [smallint] NOT NULL, 
    [Date] [date] NOT NULL, 
    [Status] [bit] NULL, 
CONSTRAINT [PK_table1] PRIMARY KEY CLUSTERED ( [ID] ASC, [Date] ASC)) 

INSERT INTO [Table1]([ID], [Date], [Status])  
SELECT 1, '2012-10-18', 1 UNION ALL 
SELECT 1, '2012-10-19', 1 UNION ALL 
SELECT 1, '2012-10-20', 0 UNION ALL 
SELECT 1, '2012-10-21', 0 UNION ALL 
SELECT 1, '2012-10-22', 0 UNION ALL 
SELECT 1, '2012-10-23', 0 UNION ALL 
SELECT 1, '2012-10-24', 1 UNION ALL 
SELECT 1, '2012-10-25', 0 UNION ALL 
SELECT 1, '2012-10-26', 0 UNION ALL 
SELECT 1, '2012-10-27', 0 UNION ALL 
SELECT 1, '2012-10-28', 1 UNION ALL 
SELECT 2, '2012-10-19', 0 UNION ALL 
SELECT 2, '2012-10-20', 0 UNION ALL 
SELECT 2, '2012-10-21', 0 UNION ALL 
SELECT 2, '2012-10-22', 1 UNION ALL 
SELECT 2, '2012-10-23', 1 

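Update:

  • If it matters, after this step I only need to filter out the days that are the 1st, 10th, or 20th day of the month.
  • Many thanks to Tomalak and gbn. In my real task the number of consecutive zeros is not 3 as in this sample, so using 9 inner joins or cross applies seems inefficient.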
+0

Which version of SQL Server? 2012? – gbn

+0

SQL Server 2008 R2 SP2 –

+1

APPLY is better than 9 JOINs. But you are not on SQL Server 2012, so that is what you need. – gbn

Answers

4

Edit: updated to take id into account.

This also works if the dates are not contiguous.

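-- For each candidate start row T1:
--   X = that row plus the next 2 rows for the same id (the 3-row window)
--   Y = the 4th row from T1 onward, i.e. the row right after the window
SELECT  T1.id, T1.[Date] AS startDate, MAX(X.[Date]) AS endDate, Y.[Status] AS NextDayStatus 
FROM  Table1 T1  
    CROSS APPLY 
    ( SELECT TOP 3 * 
    FROM   Table1 T2 
    WHERE   T2.id = T1.id AND T2.[Date] >= T1.[Date] 
    ORDER BY  T2.[Date] 
    ) X 
    CROSS APPLY 
    (SELECT TOP 4 *, ROW_NUMBER() OVER (PARTITION BY T3.id ORDER BY T3.[Date]) AS rn 
    FROM   Table1 T3 
    WHERE   T3.id = T1.id AND T3.[Date] >= T1.[Date] 
    ORDER BY  T3.[Date] 
    ) Y 
WHERE  Y.rn = 4 
GROUP BY  T1.id, T1.[Date], Y.[Status] 
-- keep only windows whose 3 statuses are all 0
HAVING  SUM(CAST(X.[Status] AS tinyint)) = 0; 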

For completeness, here is the more elegant SQL Server 2012 solution.
It can be used in any RDBMS that has proper window/analytic function support.

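SELECT 
    X.id, X.startDate, X.endDate, X.nextStatus 
FROM 
    (SELECT  T1.id, T1.[Date] AS startDate, 
     -- date of the 3rd row in the window (2 rows ahead)
     LEAD(T1.[Date], 2) OVER (PARTITION BY T1.id ORDER BY T1.[Date]) AS endDate, 
     -- status of the row right after the window (3 rows ahead)
     LEAD(T1.[Status], 3) OVER (PARTITION BY T1.id ORDER BY T1.[Date]) AS nextStatus, 
     SUM(CAST(T1.[Status] AS tinyint)) OVER (PARTITION BY T1.id ORDER BY T1.[Date] ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING) AS SumNext3 
    FROM   Table1 T1 
    ) X 
WHERE  X.SumNext3 = 0 
    -- skip windows cut short at the end of a partition (fewer than 3 rows)
    AND X.endDate IS NOT NULL; 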
+0

Dear gbn, this doesn't seem to work. I tested it several times. –

+0

@imanabidi: Of course, id now breaks the GROUP BY. For the new sample data, please add the expected output. – gbn

+0

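Yeah, it also breaks the ROW_NUMBER(), so I tried to adapt your excellent idea to my main problem, but no luck so far. Please update your answer if you have time, since I know you are much better at this than I am. –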

3
SELECT 
    z1.id, z1.[date] AS startDate, z3.[date] AS endDate, zn.status AS NextDayStatus 
FROM 
    Table1 z1 
    -- z2, z3 and zn are the next three rows after z1 for the same id
    INNER JOIN Table1 z2 ON z2.id = z1.id AND z2.[date] = (
        SELECT MIN([date]) FROM Table1 WHERE [date] > z1.[date] AND id = z1.id 
    ) 
    INNER JOIN Table1 z3 ON z3.id = z1.id AND z3.[date] = (
        SELECT MIN([date]) FROM Table1 WHERE [date] > z2.[date] AND id = z1.id 
    ) 
    INNER JOIN Table1 zn ON zn.id = z1.id AND zn.[date] = (
        SELECT MIN([date]) FROM Table1 WHERE [date] > z3.[date] AND id = z1.id 
    ) 
WHERE 
    z1.status = 0 
    AND z2.status = 0 
    AND z3.status = 0 
ORDER BY 
    z1.id, z1.[date] 

An index on Table1 (date, status, id) would be optimal.
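As a rough sketch, such an index could be created as below. The index name `IX_Table1_Date_Status_ID` is made up for illustration, and since the table already has a clustered primary key on (ID, Date), whether the optimizer actually prefers this index is something to confirm in the execution plan.

```sql
-- Hypothetical index name; the key column order follows the suggestion above.
CREATE NONCLUSTERED INDEX IX_Table1_Date_Status_ID
    ON Table1 ([Date], [Status], [ID]);
```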

+0

Dear Tomalak, this doesn't seem to work. I tested it several times and added new rows with id = 2. –

+1

@imanabidi You're right, I had not included a condition that effectively groups on `id`. See the revised answer. – Tomalak

2

Here is another solution that will also work in many SQL products (those that support window functions), and specifically in SQL Server 2005 and later:

WITH partitioned AS (
    SELECT 
    *, 
    grp = DATEDIFF(DAY, 0, Date) 
     - ROW_NUMBER() OVER (PARTITION BY ID, Status ORDER BY Date) 
    FROM Table1 
), 
grouped AS (
    SELECT 
    ID, 
    SD = MIN(Date), 
    ED = MAX(Date) 
    FROM partitioned 
    WHERE Status = 0 
    GROUP BY 
    ID, 
    grp 
    HAVING COUNT(*) >= 3 
) 
SELECT 
    t.ID, 
    StartDate  = t.Date, 
    EndDate  = DATEADD(DAY, 2, t.Date), 
    NextDayStatus = CASE t.Date WHEN DATEADD(DAY, -2, g.ED) THEN 1 ELSE 0 END 
FROM Table1 t 
INNER JOIN grouped g 
ON t.ID = g.ID AND t.Date BETWEEN g.SD AND DATEADD(DAY, -2, g.ED) 
; 

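The idea is to detect all "islands" of Status = 0, pick those that have at least 3 rows, and then join the aggregated island set back to the original table to get the rows that qualify as the start of a required run of 3 consecutive Status = 0 rows.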
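To see why the subtraction isolates islands, it may help to look at the intermediate values for a single ID. The query below is only an illustrative sketch (the `rn` column is added here for display and is not part of the answer's query). Within one (ID, Status) partition, rows on consecutive dates have a day number and a row number that both increase by 1, so their difference `grp` stays constant; a gap in the dates of that partition makes `grp` jump.

```sql
-- Inspect the island key for ID = 1; rows of the same Status on
-- consecutive dates end up with the same grp value.
SELECT
    ID,
    [Date],
    [Status],
    rn  = ROW_NUMBER() OVER (PARTITION BY ID, [Status] ORDER BY [Date]),
    grp = DATEDIFF(DAY, 0, [Date])
          - ROW_NUMBER() OVER (PARTITION BY ID, [Status] ORDER BY [Date])
FROM Table1
WHERE ID = 1
ORDER BY [Date];
```

Note that `grp` is only meaningful together with ID and Status, which is why the answer filters on Status = 0 and groups by ID and `grp`.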

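One caveat, though: this solution assumes that any 3 consecutive status-0 rows are followed by at least one more row with the same ID. In other words, the last matching set of status-0 rows in an island is expected to be followed by a status-1 row, because that is what the result set reports anyway.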
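If that assumption is a concern, one possible variation (a sketch, not the answerer's query) is to read the following day's status from the table instead of deriving it from the island boundary. It assumes contiguous dates, as stated in the question, and introduces a variable `@n` for the run length so the same query could be tried with 9 instead of 3. Runs with no following row are dropped by the inner join.

```sql
DECLARE @n int = 3;  -- run length; assumption: change to 9 for the real task

WITH partitioned AS (
    SELECT ID, [Date], [Status],
           grp = DATEDIFF(DAY, 0, [Date])
                 - ROW_NUMBER() OVER (PARTITION BY ID, [Status] ORDER BY [Date])
    FROM Table1
),
grouped AS (
    -- islands of Status = 0 that are at least @n days long
    SELECT ID, SD = MIN([Date]), ED = MAX([Date])
    FROM partitioned
    WHERE [Status] = 0
    GROUP BY ID, grp
    HAVING COUNT(*) >= @n
)
SELECT
    t.ID,
    StartDate     = t.[Date],
    EndDate       = DATEADD(DAY, @n - 1, t.[Date]),
    NextDayStatus = nxt.[Status]
FROM Table1 t
INNER JOIN grouped g
    ON t.ID = g.ID
   AND t.[Date] BETWEEN g.SD AND DATEADD(DAY, 1 - @n, g.ED)
-- look up the actual row for the day right after the run
INNER JOIN Table1 nxt
    ON nxt.ID = t.ID
   AND nxt.[Date] = DATEADD(DAY, @n, t.[Date])
ORDER BY t.ID, t.[Date];
```

Tracing this by hand against the sample data with `@n = 3` gives the four rows listed as the expected output in the question. It uses only features available in SQL Server 2008, and the number of joins does not grow with the run length.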