2017-01-23 33 views
2

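How do I remove duplicates from my file, with certain fields taking priority?

I have SQL that appends a code to a field when it detects a duplicate. There is another field called DS.

DS can be either 'Yes' or 'No'.

How can I make it so that when a duplicate is found, the 'Yes' row is not coded and the 'No' row is?

Basically, 'Yes' takes priority.

My SQL: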

WITH cte 
    AS (SELECT *, 
       Row_Number() OVER(partition BY fips_county_code, last, suffix, first, birthdate Order by (select null)) AS Rn 
     FROM [PULLED REC]) 
UPDATE cte 
SET BAD_CODES = Isnull(BAD_CODES, '') + 'D' 
WHERE RN > 1; 

Why can't you check whether DS is set to 'No'? – Anand


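I don't understand what you are trying to do here. Do you want to update all rows that are duplicates? Are you ignoring the DS column when deciding whether a row is a duplicate? Without some details about the table and what you are trying to do, this is really hard to answer. –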


@SeanLange I want to update all rows that are duplicates, but say we have two records, Name: Sam DS: Yes and Name: Sam DS: No. Then we only code the one with DS = No. – rohanharrison

Answers

2

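To update only rows where ds = 'No', you can add that condition to the where clause.

To make sure rn > 1 does not skip a duplicate row that you need to update, you can use exists(), or alternatively count(*) over().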

with cte as (
    select 
     * 
    , rn = row_number() over (
      partition by fips_county_code, last, suffix, first, birthdate 
      order by (case when DS = 'yes' then 0 else 1 end) asc 
      ) 
    from [pulled rec] 
) 
/* -- check with select first -- */ 
select * from cte 

/* 
update cte set 
    bad_codes = isnull(bad_codes, '') + 'D' 
--*/ 
/* -- Update all records that have a duplicate 
    -- except the First row, ordered by ds='Yes' first */ 
/* 
    where cte.ds = 'No' 
    and cte.rn > 1 
--*/ 
-- Update all records that have a duplicate and ds='No' -- 
--/* 
    where cte.ds = 'No' 
    and exists (
     select 1 
     from cte as i 
     where i.rn > 1 
      and i.fips_county_code = cte.fips_county_code 
      and i.last = cte.last 
      and i.suffix = cte.suffix 
      and i.first = cte.first 
      and i.birthdate = cte.birthdate 
    ); 
--*/ 
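To see why the two where clauses above can differ, here is a minimal sketch on made-up data. The table variable `@people` and the column `full_name` are hypothetical stand-ins for [pulled rec] and its five key columns, and the query only selects what each variant would code rather than updating anything.

```sql
-- Hypothetical sample data; @people stands in for [pulled rec]
declare @people table (
    id        int identity(1,1) primary key,
    full_name varchar(20),
    ds        varchar(3)
);

insert into @people (full_name, ds) values
    ('Sam', 'Yes'), ('Sam', 'No'),   -- Yes/No pair: only the 'No' row should be coded
    ('Ann', 'No'),  ('Ann', 'No'),   -- two 'No' rows in one duplicate group
    ('Bob', 'No');                   -- no duplicate: never coded

with cte as (
    select
        *
      , rn = row_number() over (
            partition by full_name
            order by (case when ds = 'Yes' then 0 else 1 end) asc
        )
    from @people
)
select
    id, full_name, ds, rn
    -- variant 1: ds = 'No' and rn > 1
  , coded_by_rn = case when ds = 'No' and rn > 1 then 'D' else '' end
    -- variant 2: ds = 'No' and some row in the same group has rn > 1
  , coded_by_exists = case when ds = 'No' and exists (
        select 1
        from cte as i
        where i.rn > 1
          and i.full_name = cte.full_name
    ) then 'D' else '' end
from cte
order by id;
```

With this data both variants code the Sam/'No' row and leave Bob alone. For Ann, the rn > 1 variant codes only one of the two rows (whichever ends up with rn = 2), while the exists() variant codes both. Which behavior you want depends on how all-'No' duplicate groups should be treated.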
Alternate version using count(*) over():

with cte as (
    select 
     * 
    , CountOver = count(*) over ( 
      partition by fips_county_code, last, suffix, first, birthdate 
      ) 
    from [pulled rec] 
) 
/* -- check with select first -- */ 
select * from cte 

/* 
update cte set 
    bad_codes = isnull(bad_codes, '') + 'D' 
--*/ 

    where cte.ds = 'No' 
    and cte.CountOver > 1 
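One practical difference between the two versions shows up if any of the key columns can be null (suffix seems a likely candidate, but that is an assumption about the data). partition by puts nulls in the same group, whereas the `=` comparisons in the exists() subquery never match null to null under the default ANSI_NULLS setting. A minimal sketch with a hypothetical table variable `@people` and hypothetical columns `last_name` and `suffix`:

```sql
-- Hypothetical sample data; @people stands in for [pulled rec]
declare @people table (
    id        int identity(1,1) primary key,
    last_name varchar(20),
    suffix    varchar(5) null,
    ds        varchar(3)
);

insert into @people (last_name, suffix, ds) values
    ('Smith', null, 'Yes'),
    ('Smith', null, 'No'),   -- duplicate of the row above; suffix is null in both
    ('Jones', 'Jr', 'No');   -- no duplicate

with cte as (
    select
        *
      , rn = row_number() over (partition by last_name, suffix order by (select null))
      , CountOver = count(*) over (partition by last_name, suffix)
    from @people
)
select
    id, last_name, suffix, ds, CountOver
    -- count(*) over() version: nulls share a partition, so the duplicate is found
  , found_by_count = case when ds = 'No' and CountOver > 1 then 1 else 0 end
    -- exists() version: null = null is not true, so the subquery finds no match
  , found_by_exists = case when ds = 'No' and exists (
        select 1
        from cte as i
        where i.rn > 1
          and i.last_name = cte.last_name
          and i.suffix = cte.suffix
    ) then 1 else 0 end
from cte
order by id;
```

Here the Smith/'No' row is flagged by the count(*) over() check but not by the exists() check. If the exists() form is preferred, each nullable column would need a null-safe comparison such as `(i.suffix = cte.suffix or (i.suffix is null and cte.suffix is null))`.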
0

I think this should point you in the right direction.

WITH cte 
    AS (SELECT *, 
       Row_Number() OVER(partition BY fips_county_code, last, suffix, first, birthdate Order by (select null)) AS Rn 
       , COUNT(*) OVER(partition BY fips_county_code, last, suffix, first, birthdate) AS DupeCount -- windowed count; SELECT * cannot be combined with GROUP BY 
     FROM [PULLED REC] 
) 
UPDATE cte 
SET BAD_CODES = case RN when 1 then BAD_CODES else Isnull(BAD_CODES, '') + 'D' end 
    , DS = Case DupeCount when 1 then 'no' else 'yes' end 