2012-06-14 72 views
7

SQL: Best way to build a timeline from two history tables

Consider the following:

CREATE TABLE Members (MemberID INT) 
INSERT Members VALUES (1001) 

CREATE TABLE PCPs (PCPID INT) 
INSERT PCPs VALUES (231) 
INSERT PCPs VALUES (327) 
INSERT PCPs VALUES (390) 

CREATE TABLE Plans (PlanID INT) 
INSERT Plans VALUES (555) 
INSERT Plans VALUES (762) 

CREATE TABLE MemberPCP (
    MemberID INT 
    , PCP INT 
    , StartDate DATETIME 
    , EndDate DATETIME) 
INSERT MemberPCP VALUES (1001, 231, '2002-01-01', '2002-06-30') 
INSERT MemberPCP VALUES (1001, 327, '2002-07-01', '2003-05-31') 
INSERT MemberPCP VALUES (1001, 390, '2003-06-01', '2003-12-31') 

CREATE TABLE MemberPlans (
    MemberID INT 
    , PlanID INT 
    , StartDate DATETIME 
    , EndDate DATETIME) 
INSERT MemberPlans VALUES (1001, 555, '2002-01-01', '2003-03-31') 
INSERT MemberPlans VALUES (1001, 762, '2003-04-01', '2003-12-31') 

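I am looking for a clean way to build a timeline of member/PCP/plan relationships, where a change in either the PCP or the plan for a member results in a separate start/end row in the result. For example, if over a few years a member changed their PCP twice and their plan once, each on a different date, I would see something like the following: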

MemberID  PCP  PlanID  StartDate   EndDate
1001      231  555     2002-01-01  2002-06-30
1001      327  555     2002-07-01  2003-03-31
1001      327  762     2003-04-01  2003-05-31
1001      390  762     2003-06-01  2003-12-31

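As you can see, I need a separate result row for each date period in which the member/PCP/plan association differs. I have a solution in place, but it is very convoluted, with a lot of CASE statements and conditional logic in the WHERE clause. I just think there must be a much simpler way to do this.

Thanks.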

+0

Can we see your work? –

+0

Could you post that complicated CASE statement to [SQLFiddle](http://sqlfiddle.com/) so that we can see what you did? –

+0

This is a pretty complicated thing to do. I don't know if there is a *simpler* way to do it. So you should post your solution, and we can help you from there – Lamak

Answers

0

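My approach is to take the unique combinations of start dates for each member as the starting point, and then build out the other pieces of the query from there: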

-- 
-- Traverse down a list of 
-- unique Member ID and StartDates 
-- 
-- For each row find the most 
-- recent PCP for that member 
-- which started on or before 
-- the start date of the current 
-- row in the traversal 
-- 
-- For each row find the most 
-- recent PlanID for that member 
-- which started on or before 
-- the start date of the current 
-- row in the traversal 
-- 
-- For each row find the earliest 
-- end date for that member 
-- (from a collection of unique 
-- member end dates) that happened 
-- after the start date of the 
-- current row in the traversal 
-- 
SELECT MemberID, 
    (SELECT TOP 1 PCP 
    FROM MemberPCP 
    WHERE MemberID = s.MemberID 
    AND StartDate <= s.StartDate 
    ORDER BY StartDate DESC 
) AS PCP, 
    (SELECT TOP 1 PlanID 
    FROM MemberPlans 
    WHERE MemberID = s.MemberID 
    AND StartDate <= s.StartDate 
    ORDER BY StartDate DESC 
) AS PlanID, 
    StartDate, 
    (SELECT TOP 1 EndDate 
    FROM (
    SELECT MemberID, EndDate 
    FROM MemberPlans 
    UNION 
    SELECT MemberID, EndDate 
    FROM MemberPCP) e 
    WHERE e.MemberID = s.MemberID -- only consider end dates of the current member
    AND EndDate >= s.StartDate 
    ORDER BY EndDate 
) AS EndDate 
FROM ( 
    SELECT 
    MemberID, 
    StartDate 
    FROM MemberPlans 
    UNION 
    SELECT 
    MemberID, 
    Startdate 
    FROM MemberPCP 
) s 
ORDER BY StartDate 
+0

Thank you all. All the suggestions were great. I am marking this one as the answer because it does allow for gaps in the Plan/PCP activity ranges. –

0

Maybe this will give you some ideas to start with:

SELECT y.memberid, y.pcp, z.planid, x.startdate, x.enddate 
    FROM (
     WITH startdates AS (

      SELECT startdate FROM memberpcp 
      UNION 
      SELECT startdate FROM memberplans 
      UNION 
      SELECT enddate + 1 FROM memberpcp 
      UNION 
      SELECT enddate + 1 FROM memberplans 

      ), enddates AS (
      SELECT enddate FROM memberpcp 
      UNION 
      SELECT enddate FROM memberplans 

     ) 

     SELECT s.startdate, e.enddate 
      FROM startdates s 
       ,enddates e 
      WHERE e.enddate = (SELECT MIN(enddate) 
           FROM enddates 
           WHERE enddate > s.startdate) 
     ) x 
     ,memberpcp y 
     ,memberplans z 

    WHERE (y.startdate, y.enddate) = (SELECT startdate, enddate FROM memberpcp WHERE startdate <= x.startdate AND enddate >= x.enddate) 
    AND (z.startdate, z.enddate) = (SELECT startdate, enddate FROM memberplans WHERE startdate <= x.startdate AND enddate >= x.enddate) 

I ran it on Oracle with these results:

1001 231 555 01-JAN-02 30-JUN-02 
1001 327 555 01-JUL-02 31-MAR-03 
1001 327 762 01-APR-03 31-MAY-03 
1001 390 762 01-JUN-03 31-DEC-03 

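The idea is to first define the distinct date ranges. That is done in the "WITH" clause. Then look up each range in the other tables. There are a lot of assumptions here about overlapping ranges, but maybe it's a start. I tried to do this without analytic functions, since T-SQL might not support analytic functions well. I don't know. When building the real date ranges, the ranges would also need to be built per memberid.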
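The idea described above, with the ranges built per member, can be sketched outside SQL as follows. This is only a minimal Python illustration, not the Oracle query itself; the names `covering_value` and `build_timeline` are made up for the sketch, and it assumes inclusive whole-day ranges that do not overlap within one table.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

# Sample rows from the question: (member_id, value, start, end), dates inclusive.
MEMBER_PCP = [
    (1001, 231, date(2002, 1, 1), date(2002, 6, 30)),
    (1001, 327, date(2002, 7, 1), date(2003, 5, 31)),
    (1001, 390, date(2003, 6, 1), date(2003, 12, 31)),
]
MEMBER_PLANS = [
    (1001, 555, date(2002, 1, 1), date(2003, 3, 31)),
    (1001, 762, date(2003, 4, 1), date(2003, 12, 31)),
]


def covering_value(rows, start, end):
    """Return the value of the row whose range contains [start, end], else None."""
    for _member, value, row_start, row_end in rows:
        if row_start <= start and end <= row_end:
            return value
    return None


def build_timeline(pcp_rows, plan_rows):
    """Split each member's history at every boundary, then look up each piece."""
    timeline = []
    for member in sorted({r[0] for r in pcp_rows + plan_rows}):
        pcps = [r for r in pcp_rows if r[0] == member]
        plans = [r for r in plan_rows if r[0] == member]
        # Every start date, and the day after every end date, opens a new period.
        cuts = sorted({r[2] for r in pcps + plans}
                      | {r[3] + ONE_DAY for r in pcps + plans})
        for start, next_start in zip(cuts, cuts[1:]):
            end = next_start - ONE_DAY
            pcp = covering_value(pcps, start, end)
            plan = covering_value(plans, start, end)
            if pcp is not None or plan is not None:  # skip gaps covered by neither table
                timeline.append((member, pcp, plan, start, end))
    return timeline
```

Calling `build_timeline(MEMBER_PCP, MEMBER_PLANS)` on the sample data yields the four rows shown in the question.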

1

Compatible with T-SQL. I agree with Glenn's general approach.

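Another suggestion: if your business allows gaps between periods, this code will need further tweaking. Otherwise, I think deriving the EndDate value from the next record's StartDate would be better, because it gives your code more controlled behaviour. In that case, you would want to enforce that rule before the data reaches this query.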
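To illustrate the suggestion of deriving each EndDate from the next record's StartDate, here is a minimal Python sketch. It is an assumption-laden illustration rather than part of the T-SQL answer: the function name `with_derived_end_dates` is hypothetical, the input rows carry no stored end date, periods are assumed contiguous, and the last period of each member is left open (`None`).

```python
from datetime import date, timedelta

# PCP history from the question, start dates only: (member_id, pcp, start).
PCP_STARTS = [
    (1001, 231, date(2002, 1, 1)),
    (1001, 327, date(2002, 7, 1)),
    (1001, 390, date(2003, 6, 1)),
]


def with_derived_end_dates(rows):
    """Return (member_id, value, start, end) with end = next start - 1 day."""
    by_member = {}
    for member, value, start in rows:
        by_member.setdefault(member, []).append((start, value))

    result = []
    for member in sorted(by_member):
        history = sorted(by_member[member])  # order each member's rows by start date
        for i, (start, value) in enumerate(history):
            if i + 1 < len(history):
                end = history[i + 1][0] - timedelta(days=1)
            else:
                end = None  # assumption: the latest period is still open
            result.append((member, value, start, end))
    return result
```

With end dates derived this way, gaps and overlaps within one history cannot occur, which is the "more controlled behaviour" mentioned above.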

Edit: Just learned about the WITH statement and SQL Fiddle from Andriy M's post. You can also see my answer at SQL Fiddle.

Edit: Fixed the bug that Andriy pointed out.

WITH StartDates AS (
SELECT MemberId, StartDate FROM MemberPCP UNION 
SELECT MemberId, StartDate FROM MemberPlans UNION 
SELECT MemberId, EndDate + 1 FROM MemberPCP UNION 
SELECT MemberId, EndDate + 1 FROM MemberPlans 
), 
EndDates AS (
SELECT MemberId, EndDate = StartDate - 1 FROM MemberPCP UNION 
SELECT MemberId, StartDate - 1 FROM MemberPlans UNION 
SELECT MemberId, EndDate FROM MemberPCP UNION 
SELECT MemberId, EndDate FROM MemberPlans 
), 
Periods AS (
SELECT s.MemberId, s.StartDate, EndDate = min(e.EndDate) 
    FROM StartDates s 
     INNER JOIN EndDates e 
      ON s.StartDate <= e.EndDate 
      AND s.MemberId = e.MemberId 
GROUP BY s.MemberId, s.StartDate 
) 
SELECT MemberId = p.MemberId, 
     pcp.PCP, pl.PlanId, 
     p.StartDate, p.EndDate 
    FROM Periods p 
     LEFT JOIN MemberPCP pcp 
      -- because of the way we divided the periods, 
      -- at most one record fits this join clause 
      ON p.StartDate >= pcp.StartDate 
      AND p.EndDate <= pcp.EndDate 
      AND p.MemberId = pcp.MemberId 
     LEFT JOIN MemberPlans pl 
      ON p.StartDate >= pl.StartDate 
      AND p.EndDate <= pl.EndDate 
      AND p.MemberId = pl.MemberId 
ORDER BY p.MemberId, p.StartDate 
+0

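This doesn't seem to work correctly when the two history tables do not cover the same date ranges. That may not be a requirement, though, and otherwise this seems to work fine and is possibly more efficient than expanding the ranges and then collapsing them back, as in my answer. –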

+0

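Andriy, I see there was a bug, which is now corrected. Start dates should participate in the set of end dates, and vice versa. Otherwise, as you said, the edge periods will not be detected correctly, because they have no corresponding end date (or start date). I changed my SQL Fiddle example to demonstrate this case. – kennethc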

+0

Great job, would upvote again if I could! –

1

As a solution that is perhaps not the most efficient, but at least simple and straightforward, I would do the following:

  • 1) expand the ranges;

  • 2) join the expanded ranges;

  • 3) group the results.

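This assumes, of course, that only dates are used (i.e. the time part of StartDate and EndDate is 00:00 in both tables).

To expand the date ranges, I prefer to use a numbers table, like this: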

SELECT 
    m.MemberID, 
    m.PCP, 
    Date = DATEADD(DAY, n.Number, m.StartDate) 
FROM MemberPCP m 
    INNER JOIN Numbers n 
    ON n.Number BETWEEN 0 AND DATEDIFF(DAY, m.StartDate, m.EndDate) 

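and similarly for MemberPlans.

To produce the combined row set, I would use FULL JOIN, but if you know beforehand that both tables cover exactly the same time periods, INNER JOIN will do just as well: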

SELECT * 
FROM MemberPCPExpanded pcp 
    FULL JOIN MemberPlansExpanded plans 
    ON pcp.MemberID = plans.MemberID AND pcp.Date = plans.Date 

Now you only need to group the resulting rows and find the minimum and maximum dates for each (MemberID, PCP, PlanID) combination:

SELECT 
    MemberID = ISNULL(pcp.MemberID, plans.MemberID), 
    pcp.PCP, 
    plans.PlanID, 
    StartDate = MIN(ISNULL(pcp.Date, plans.Date)), 
    EndDate = MAX(ISNULL(pcp.Date, plans.Date)) 
FROM MemberPCPExpanded pcp 
    FULL JOIN MemberPlansExpanded plans 
    ON pcp.MemberID = plans.MemberID AND pcp.Date = plans.Date 
GROUP BY 
    ISNULL(pcp.MemberID, plans.MemberID), 
    pcp.PCP, 
    plans.PlanID 

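Note that if you use INNER JOIN instead of FULL JOIN, you won't need all those ISNULL() expressions; selecting either table's column is enough, e.g. pcp.MemberID instead of ISNULL(pcp.MemberID, plans.MemberID) and pcp.Date instead of ISNULL(pcp.Date, plans.Date).

The complete query might look like this, then: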

WITH MemberPCPExpanded AS (
    SELECT 
    m.MemberID, 
    m.PCP, 
    Date = DATEADD(DAY, n.Number, m.StartDate) 
    FROM MemberPCP m 
    INNER JOIN Numbers n 
     ON n.Number BETWEEN 0 AND DATEDIFF(DAY, m.StartDate, m.EndDate) 
), 
MemberPlansExpanded AS (
    SELECT 
    m.MemberID, 
    m.PlanID, 
    Date = DATEADD(DAY, n.Number, m.StartDate) 
    FROM MemberPlans m 
    INNER JOIN Numbers n 
     ON n.Number BETWEEN 0 AND DATEDIFF(DAY, m.StartDate, m.EndDate) 
) 
SELECT 
    MemberID = ISNULL(pcp.MemberID, plans.MemberID), 
    pcp.PCP, 
    plans.PlanID, 
    StartDate = MIN(ISNULL(pcp.Date, plans.Date)), 
    EndDate = MAX(ISNULL(pcp.Date, plans.Date)) 
FROM MemberPCPExpanded pcp 
    FULL JOIN MemberPlansExpanded plans 
    ON pcp.MemberID = plans.MemberID AND pcp.Date = plans.Date 
GROUP BY 
    ISNULL(pcp.MemberID, plans.MemberID), 
    pcp.PCP, 
    plans.PlanID 
ORDER BY 
    MemberID, 
    StartDate 

You can try this query at SQL Fiddle.
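For readers who want to trace the three steps (expand, join, group) outside the database, here is a minimal Python sketch of the same technique. It is an illustration under assumptions, not a translation of the T-SQL: `expand` and `build_timeline` are hypothetical names, the union of dictionary keys stands in for the FULL JOIN, and dates are whole days with inclusive ranges.

```python
from datetime import date, timedelta

# Sample rows from the question: (member_id, value, start, end), dates inclusive.
MEMBER_PCP = [
    (1001, 231, date(2002, 1, 1), date(2002, 6, 30)),
    (1001, 327, date(2002, 7, 1), date(2003, 5, 31)),
    (1001, 390, date(2003, 6, 1), date(2003, 12, 31)),
]
MEMBER_PLANS = [
    (1001, 555, date(2002, 1, 1), date(2003, 3, 31)),
    (1001, 762, date(2003, 4, 1), date(2003, 12, 31)),
]


def expand(rows):
    """Step 1: map (member_id, day) -> value for every day of each range."""
    days = {}
    for member, value, start, end in rows:
        for offset in range((end - start).days + 1):
            days[(member, start + timedelta(days=offset))] = value
    return days


def build_timeline(pcp_rows, plan_rows):
    pcp_days = expand(pcp_rows)
    plan_days = expand(plan_rows)
    groups = {}
    # Step 2: the union of keys keeps days present in either table (like FULL JOIN).
    for member, day in pcp_days.keys() | plan_days.keys():
        key = (member, pcp_days.get((member, day)), plan_days.get((member, day)))
        # Step 3: track the minimum and maximum day per (member, PCP, plan) group.
        low, high = groups.get(key, (day, day))
        groups[key] = (min(low, day), max(high, day))
    rows = [(m, pcp, plan, low, high) for (m, pcp, plan), (low, high) in groups.items()]
    return sorted(rows, key=lambda r: (r[0], r[3]))
```

Like the GROUP BY in the query, this sketch produces one row per (member, PCP, plan) combination, so a combination that ends and later recurs would be merged into a single range.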