2013-09-24 130 views
2

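I have been banging my head against this for a while now and I am getting nowhere fast; the data must stay at row level. T-SQL: keep valid duplicates and delete invalid duplicates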

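I want to keep the data from the earliest load, where duplicates are valid. Load1 represents a batch ID. Not all values have duplicates.

I want to return: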

Code1 Code2 Code3 Load1 LoadTime 
a1  a1  a1  1  2013-09-10 
a1  a1  a1  1  2013-09-10 
a1  a1  a1  1  2013-09-10 
a2  a1  a1  2  2013-09-12 
a1  a2  a1  3  2013-09-13 
a1  a2  a1  3  2013-09-13 

Any suggestions?

CREATE TABLE #Test (
Code1 varchar(10), 
Code2 varchar(10), 
Code3 varchar(10), 
Load1 varchar(10), 
LoadTime DATE 
) 


    INSERT INTO #Test 
    VALUES ('a1','a1','a1','1','2013-09-10') --Keep 

    INSERT INTO #Test 
    VALUES ('a1','a1','a1','1','2013-09-10') --Keep 

    INSERT INTO #Test 
    VALUES ('a1','a1','a1','1','2013-09-10') --Keep 

    INSERT INTO #Test 
    VALUES ('a1','a1','a1','2','2013-09-11') --Delete 

    INSERT INTO #Test 
    VALUES ('a2','a1','a1','2','2013-09-12') --Keep 

    INSERT INTO #Test 
    VALUES ('a2','a1','a1','3','2013-09-13') --Delete 

    INSERT INTO #Test 
    VALUES ('a1','a2','a1','3','2013-09-13') --Keep 

    INSERT INTO #Test 
    VALUES ('a1','a2','a1','3','2013-09-13') --Keep 

    INSERT INTO #Test 
    VALUES ('a1','a2','a1','4','2013-09-13')-- Delete 

    INSERT INTO #Test 
    VALUES ('a1','a2','a1','4','2013-09-13')-- Delete 
+0

What is an invalid duplicate? –

+0

I realise I have asked this question very badly. I will have to rewrite it. Thanks – pekingducksoup

Answers

0

You can use a SQL Server common table expression (CTE):

with cte as (
    select 
     dense_rank() over(partition by Code1, Code2, Code3 order by LoadTime, Load1 asc) as rn 
    from Table1 
) 
delete from cte where rn > 1 

sql fiddle demo

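This query is actually easy in SQL Server, because SQL Server treats a simple common table expression as an updatable view: you don't have to join the CTE back to your original table, you can just delete from cte.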
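Before running the delete, it may help to preview the ranking. The sketch below applies the same `dense_rank()` to the question's `#Test` table (assumed here in place of `Table1`) and only selects, so nothing is removed. On the sample data it should mark four rows with `rn = 2`, which are the ones commented as Delete in the question.

```sql
-- Preview only: same partitioning and ordering as the delete above,
-- run against the question's #Test table (assumed instead of Table1).
-- Rows with rn > 1 are the ones the delete would remove.
with cte as (
    select
        Code1, Code2, Code3, Load1, LoadTime,
        dense_rank() over (
            partition by Code1, Code2, Code3
            order by LoadTime, Load1
        ) as rn
    from #Test
)
select Code1, Code2, Code3, Load1, LoadTime, rn
from cte
order by Code1, Code2, Code3, rn;
```

Note that `Load1` is declared as varchar(10), so it sorts as text ('10' comes before '2'). If the batch IDs are always numeric, which is an assumption, ordering by `cast(Load1 as int)` would be safer once they pass a single digit.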

+0

Thanks, works like a charm – pekingducksoup

0

You might want to look at row_number() / dense_rank().

Something like this:

;with cte as (
    select *,
        -- Load1 added as a tie-breaker: loads 3 and 4 share the same LoadTime
        dense_rank() over (partition by Code1, Code2, Code3 order by LoadTime, Load1) rn
    from #Test)
delete t
from #Test t
    inner join cte
        on t.Code1 = cte.Code1
        and t.Code2 = cte.Code2
        and t.Code3 = cte.Code3
        and t.Load1 = cte.Load1
        and t.LoadTime = cte.LoadTime
where cte.rn > 1

It is hard to tell the keep-or-delete logic from the sample data, though (the join is much easier if your data has a unique ID).
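To illustrate the point about a unique ID, here is a minimal sketch that adds a surrogate key to `#Test` and joins on it alone. The column name `Id` is hypothetical and not part of the original table. `go` is the batch separator used by SSMS and sqlcmd; it is needed so that the new column is visible when the delete is compiled.

```sql
-- Assumption: #Test has no key of its own, so add a hypothetical
-- identity column named Id to give every row a unique value.
alter table #Test add Id int identity(1,1) not null;
go

-- With a unique Id, the join needs a single column instead of all five.
with cte as (
    select
        Id,
        dense_rank() over (
            partition by Code1, Code2, Code3
            order by LoadTime, Load1
        ) as rn
    from #Test
)
delete t
from #Test as t
    inner join cte
        on t.Id = cte.Id
where cte.rn > 1;
```

Because each row has its own `Id`, the join matches exactly one row per ranked row, so exact duplicates no longer multiply in the join.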