2011-10-12

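How do I write a query to extract individual changes from snapshots of data?

I need to create a process that extracts the changes from a table in which each row is a snapshot of a row in another table. The real-world problem involves many tables with many fields, but as a simple example, suppose I have the following snapshot data: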

Sequence DateTaken  ID  Field1 Field2 
-------- ----------- ----  ------ ------ 
     1 '2011-01-01'  1  'Red'   2 
     2 '2011-01-01'  2  'Blue'  10 
     3 '2011-02-01'  1  'Green'  2 
     4 '2011-03-01'  1  'Green'  3 
     5 '2011-03-01'  2  'Purple'  2 
     6 '2011-04-01'  1  'Yellow'  2 

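The Sequence and DateTaken fields relate directly to the snapshot table itself. The ID field is the primary key of the source table, and Field1 and Field2 are other fields in that same (source) table.

I can get part of the way to a solution with a query like this: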

WITH Snapshots (Sequence, DateTaken, ID, Field1, Field2, _Index) 
AS 
(
    SELECT Sequence, DateTaken, ID, Field1, Field2, ROW_NUMBER() OVER (ORDER BY ID, Sequence) _Index 
    FROM #Snapshots 
) 
SELECT 
     c.DateTaken, c.ID 
    , c.Field1 Field1_Current, p.Field1 Field1_Previous, CASE WHEN c.Field1 = p.Field1 THEN 0 ELSE 1 END Field1_Changed 
    , c.Field2 Field2_Current, p.Field2 Field2_Previous, CASE WHEN c.Field2 = p.Field2 THEN 0 ELSE 1 END Field2_Changed 
FROM Snapshots c 
JOIN Snapshots p ON p.ID = c.ID AND (p._Index + 1) = c._Index 
ORDER BY c.Sequence DESC 

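The query above identifies what changed from one snapshot to the next, but it is still not in the form I need. Each row of the output may contain multiple changes. At the end of the day, I need one row per change, identifying which field changed along with its previous and current values. Fields that did not actually change need to be excluded from the final output. So, if the output of the query above is this: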

DateTaken  ID Field1_Current Field1_Previous Field1_Changed Field2_Current Field2_Previous Field2_Changed
---------- -- -------------- --------------- -------------- -------------- --------------- --------------
2011-04-01 1  Yellow         Green           1              2              3               1
2011-02-01 1  Green          Red             1              2              2               0

I need to transform it into this:

DateTaken  ID Field  Previous Current
---------- -- ------ -------- -------
2011-04-01 1  Field1 Green    Yellow
2011-04-01 1  Field2 3        2
2011-02-01 1  Field1 Red      Green

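I think I might be able to get there with UNPIVOT, but I haven't been able to make it work. I consider any solution involving cursors or the like to be an absolute last resort.

Thanks for any suggestions.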

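I had the same problem. See http://stackoverflow.com/questions/6348405 (Better way to Partially UNPIVOT in Pairs in SQL). Note that you will need to cast, because otherwise you will run into a data type problem.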
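To make the data type remark in the comment above concrete: UNPIVOT expects every column in its IN list to have the same type (and length), so a varchar column and an int column cannot be unpivoted together as they are. The following is only a minimal sketch of that point; the table variable @Sample is made up for illustration and is not from the question.

```sql
-- Hypothetical one-row sample with the same column types as the question.
DECLARE @Sample TABLE (ID int, Field1 varchar(20), Field2 int);
INSERT INTO @Sample VALUES (1, 'Red', 2);

-- Unpivoting Field1 and Field2 directly is rejected with a type conflict,
-- because varchar(20) and int differ:
-- SELECT ID, field, value
-- FROM @Sample UNPIVOT (value FOR field IN (Field1, Field2)) AS u;

-- Casting both columns to one common type first lets UNPIVOT accept them.
SELECT ID, field, value
FROM (
    SELECT ID,
           CAST(Field1 AS varchar(max)) AS Field1,
           CAST(Field2 AS varchar(max)) AS Field2
    FROM @Sample
) AS src
UNPIVOT (value FOR field IN (Field1, Field2)) AS u;
```

The cost of this approach is that every value comes back as a string, so numeric or date comparisons downstream are string comparisons unless the values are converted back.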

Answers

3

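Here is a working example that uses UNPIVOT. It is based on my answer to my own question, Better way to Partially UNPIVOT in Pairs in SQL.

It has some nice features.

  1. Adding additional fields is easy. Just add the values to the SELECT and UNPIVOT clauses. You don't have to add additional UNION clauses.

  2. The WHERE clause, WHERE curr.value <> prev.value, never changes, no matter how many fields are added.

  3. The performance is surprisingly fast.

  4. It is portable to current versions of Oracle, if you need that.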


SQL

Declare @Snapshots as table(
Sequence int, 
DateTaken  datetime, 
[id] int, 
field1 varchar(20), 
field2 int) 



INSERT INTO @Snapshots VALUES 

     (1, '2011-01-01',  1,  'Red',   2), 
     (2, '2011-01-01',  2,  'Blue',  10), 
     (3, '2011-02-01',  1,  'Green',  2), 
     (4, '2011-03-01',  1,  'Green' ,  3), 
     (5, '2011-03-01',  2,  'Purple',  2), 
     (6, '2011-04-01',  1,  'Yellow',  2) 

;WITH Snapshots (Sequence, DateTaken, ID, Field1, Field2, _Index) 
AS 
(
    SELECT Sequence, DateTaken, ID, Field1, Field2, ROW_NUMBER() OVER (ORDER BY ID, Sequence) _Index 
    FROM @Snapshots 
) 
, data AS
(
    -- Pair each snapshot with the previous snapshot of the same ID.
    -- UNPIVOT requires all unpivoted columns to share one type,
    -- so every value is cast to varchar(max).
    SELECT
        c._Index
        , c.DateTaken
        , c.ID
        , CAST(c.Field1 AS varchar(max)) Field1
        , CAST(p.Field1 AS varchar(max)) Field1_Previous
        , CAST(c.Field2 AS varchar(max)) Field2
        , CAST(p.Field2 AS varchar(max)) Field2_Previous
    FROM Snapshots c
    JOIN Snapshots p ON p.ID = c.ID AND (p._Index + 1) = c._Index
)
, fieldsToRows AS
(
    -- One row per snapshot pair and column; "field" holds the column name.
    SELECT DateTaken,
           id,
           _Index,
           value,
           field
    FROM data p UNPIVOT (value FOR field IN (field1, field1_previous,
                                             field2, field2_previous))
         AS unpvt
)
-- Match each current value to its "_Previous" counterpart and keep only real changes.
SELECT
    curr.DateTaken,
    curr.ID,
    curr.field,
    prev.value previous,
    curr.value 'current'
FROM
    fieldsToRows curr
    INNER JOIN fieldsToRows prev
        ON curr.ID = prev.id
        AND curr._Index = prev._Index
        AND curr.field + '_Previous' = prev.field
WHERE
    curr.value <> prev.value

Output

DateTaken               ID          field     previous current
----------------------- ----------- --------- -------- -------
2011-02-01 00:00:00.000 1           Field1    Red      Green
2011-03-01 00:00:00.000 1           Field2    2        3
2011-04-01 00:00:00.000 1           Field1    Green    Yellow
2011-04-01 00:00:00.000 1           Field2    3        2
2011-03-01 00:00:00.000 2           Field1    Blue     Purple
2011-03-01 00:00:00.000 2           Field2    10       2
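To illustrate the claim that adding fields is easy, here is a rough sketch of the same query extended with a third column. Field3 (a datetime) and the two sample rows are assumptions made up for this illustration; they are not part of the original question. Only the data and fieldsToRows steps grow; the final join and WHERE clause stay as they were.

```sql
-- Hypothetical extension: Field3 is not in the original question.
DECLARE @Snapshots TABLE (
    Sequence  int,
    DateTaken datetime,
    ID        int,
    Field1    varchar(20),
    Field2    int,
    Field3    datetime
);

INSERT INTO @Snapshots VALUES
    (1, '2011-01-01', 1, 'Red',   2, '2010-12-31'),
    (2, '2011-02-01', 1, 'Green', 2, '2011-01-15');

;WITH Snapshots AS
(
    SELECT Sequence, DateTaken, ID, Field1, Field2, Field3,
           ROW_NUMBER() OVER (ORDER BY ID, Sequence) AS _Index
    FROM @Snapshots
)
, data AS
(
    SELECT
        c._Index, c.DateTaken, c.ID
        , CAST(c.Field1 AS varchar(max)) AS Field1
        , CAST(p.Field1 AS varchar(max)) AS Field1_Previous
        , CAST(c.Field2 AS varchar(max)) AS Field2
        , CAST(p.Field2 AS varchar(max)) AS Field2_Previous
        -- New pair: style 126 (ISO 8601) gives dates a stable string form.
        , CONVERT(varchar(max), c.Field3, 126) AS Field3
        , CONVERT(varchar(max), p.Field3, 126) AS Field3_Previous
    FROM Snapshots c
    JOIN Snapshots p ON p.ID = c.ID AND p._Index + 1 = c._Index
)
, fieldsToRows AS
(
    SELECT DateTaken, ID, _Index, value, field
    FROM data
    UNPIVOT (value FOR field IN (Field1, Field1_Previous,
                                 Field2, Field2_Previous,
                                 Field3, Field3_Previous)) AS unpvt  -- new pair listed here
)
SELECT curr.DateTaken, curr.ID, curr.field,
       prev.value AS previous, curr.value AS [current]
FROM fieldsToRows curr
JOIN fieldsToRows prev
    ON curr.ID = prev.ID
   AND curr._Index = prev._Index
   AND curr.field + '_Previous' = prev.field
WHERE curr.value <> prev.value;  -- unchanged from the two-field version
```

With this sample data the query should report changes for Field1 and Field3 and skip Field2, which did not change. One thing to keep in mind is that UNPIVOT discards NULL values, so a change to or from NULL would not show up unless NULLs are first replaced with a placeholder value.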
This is excellent, Conrad, thank you.

1
WITH Snapshots (Sequence, DateTaken, ID, Field, FieldValue, _Index) AS 
(
    -- One branch per field; each branch turns that column into (Field, FieldValue) rows.
    SELECT 
     Sequence, 
     DateTaken, 
     ID, 
     'Field1' AS Field, 
     CAST(Field1 AS VARCHAR(100)) AS FieldValue, -- Find an appropriate length 
     ROW_NUMBER() OVER (ORDER BY ID, Sequence) 
    FROM 
     #Snapshots 
    UNION ALL 
    SELECT 
     Sequence, 
     DateTaken, 
     ID, 
     'Field2' AS Field, 
     CAST(Field2 AS VARCHAR(100)) AS FieldValue, -- Find an appropriate length 
     ROW_NUMBER() OVER (ORDER BY ID, Sequence) 
    FROM 
     #Snapshots 
) 
SELECT 
    S2.DateTaken, -- Date of the newer snapshot, as in the desired output 
    S1.ID, 
    S1.Field, 
    S1.FieldValue AS Previous, 
    S2.FieldValue AS [New] -- Not necessarily "Current" 
FROM 
    Snapshots S1 
INNER JOIN Snapshots S2 ON 
    S2.ID = S1.ID AND 
    S2.Field = S1.Field AND 
    S2._Index = S1._Index + 1 AND 
    S2.FieldValue <> S1.FieldValue -- Might need to handle NULL values 
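On the NULL remark in the last line of the query: `<>` evaluates to UNKNOWN when either side is NULL, so a change to or from NULL would silently drop out of the result. One possible way around that in T-SQL is the EXISTS ... EXCEPT idiom, since EXCEPT treats two NULLs as equal. The sketch below is only an illustration against a made-up table variable (@Pairs and its column names are hypothetical), not part of the answer above.

```sql
-- Hypothetical sample: previous/new value pairs, some involving NULL.
DECLARE @Pairs TABLE (ID int, PreviousValue varchar(100), NewValue varchar(100));
INSERT INTO @Pairs VALUES
    (1, 'Red',  'Green'),  -- ordinary change
    (2, 'Blue', 'Blue'),   -- no change
    (3, NULL,   'Green'),  -- NULL to value
    (4, 'Red',  NULL),     -- value to NULL
    (5, NULL,   NULL);     -- both NULL, no change

-- Plain <> returns only ID 1, because comparisons with NULL are UNKNOWN.
SELECT ID FROM @Pairs WHERE PreviousValue <> NewValue;

-- EXCEPT treats two NULLs as equal, so this returns IDs 1, 3 and 4.
SELECT ID
FROM @Pairs
WHERE EXISTS (SELECT PreviousValue EXCEPT SELECT NewValue);
```

Applied to the query above, the same idea would mean moving the value comparison into a WHERE clause of the form `EXISTS (SELECT S1.FieldValue EXCEPT SELECT S2.FieldValue)`. Whether a NULL should count as a change at all depends on what NULL means in the source tables.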
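This is a very nice solution, Tom. Thank you for taking the time to help me. I'm giving Conrad the accepted answer because I think his version will adapt better to the real-world situation, where each table has many fields.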
