2017-01-31
1

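SQL procedure: remove duplicates and reassign foreign key references

I work for a company that handles rankings for competitions.

Unfortunately, their member table has no unique constraint on the email column, and some users have created a new account with the same email address for every race or team they have joined.

I would like to put a unique constraint on the column to prevent any duplicates in the future, but...

Question: How can I remove the duplicates with a single query without losing the data connected to them?

I think it involves updating all foreign keys to point at one instance of the user and then deleting the duplicates.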

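Clarification: In the example below, the rows marked with * refer to the duplicate members with IDs 03, 04, 05 and 06. In this case, the solution would be:

  1. Foreign key references to IDs 03 and 05 are changed to 01.
  2. Foreign key references to IDs 04 and 06 are changed to 02.
  3. The duplicate members with IDs 03, 04, 05 and 06 are deleted.

But how can this be done in MSSQL?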

Member table 
ID | Username | Gender | Email 
01 | User1 | Male | [email protected] 
02 | User2 | Female | [email protected] 
*03 | User3 | Male | [email protected] 
*04 | User4 | Female | [email protected] 
*05 | User5 | Male | [email protected] 
*06 | User6 | Female | [email protected] 


MemberToTeam table 
MemberID_fk | TeamID_fk 
01   | 01 
02   | 01 
*03   | 02 
*04   | 02 
*05   | 03 
*06   | 03 

RaceRank table 
RaceID_fk | MemberID_fk | Ranking 
01  | 01   | 12 
01  | 02   | 1 
*02  | 03   | 5 
*02  | 04   | 7 
*03  | 05   | 4 
*03  | 06   | 9 

Thanks for your help.

+3

Please show some sample data to go with your explanation. –

+0

Hi, I have updated my question. Thanks for your comment. :) – LiHRaM

+0

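The steps in your scenario are exactly what you need to do. Update the foreign key references, then delete the duplicates. Not sure what you are asking. It is an UPDATE statement followed by a DELETE statement. –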

Answers

2

This does it in one query. Repeat it for the other tables.

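-- Emails that occur on more than one member row
with FAKES as
(
    select Email
    from Member
    group by Email
    having count(id) > 1
),
-- Number the rows within each duplicated email; c_id = 1 is the lowest ID, which is kept
FAKE_ID as
(
    select id, email, row_number() over (partition by email order by id) as c_id
    from Member
    where email in (select Email from FAKES)
),
-- Map each duplicate ID (id) to the ID that is kept (val_id)
DEDUP as
(
    select fi.id, f2.id as val_id
    from FAKE_ID fi
    inner join FAKE_ID f2
        on fi.email = f2.email
    where fi.c_id > 1
    and f2.c_id = 1
)
-- Repoint the foreign keys in MemberToTeam from the duplicates to the kept member
update mt
set mt.MemberID_fk = dd.val_id
from MemberToTeam mt
inner join DEDUP dd
    on dd.id = mt.MemberID_fk;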

Tested here
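Before running an update like the one above, it may help to preview which IDs would be remapped. The following is a minimal read-only sketch, assuming the Member table and column names from the question and that the lowest ID per email is the one to keep. The output aliases duplicate_id and kept_id are made up for illustration.

```sql
-- Sketch: list each duplicate member ID next to the ID that would be kept.
-- Read-only; nothing is updated or deleted.
with FAKE_ID as
(
    select id, email,
           row_number() over (partition by email order by id) as c_id
    from Member
)
select fi.id    as duplicate_id,  -- hypothetical alias: row that would be removed
       f2.id    as kept_id,       -- hypothetical alias: lowest ID for the same email
       fi.email
from FAKE_ID fi
inner join FAKE_ID f2
    on f2.email = fi.email
   and f2.c_id = 1
where fi.c_id > 1
order by kept_id, duplicate_id;
```

With the sample data from the question, this should pair IDs 03 and 05 with 01, and IDs 04 and 06 with 02, which matches the mapping the question asks for.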

0

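You will probably need to update every other table that links to the member table through a foreign key.

Out of all the records in the member table that share the same email address, you can pick a single record to keep, and then update the linking tables with a query like this: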

update myreferencetable set memberid = [the single instance of the member]
where memberid in (select memberid from member where email = [email address with duplicates])

+0

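Hey 洛林! If you can write that query so that I don't have to update the member table member by member, I will accept it. Processing thousands of records by hand would take longer than I will be employed at this company. ;) – LiHRaM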
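A set-based form of the query, handling every duplicated email in one statement instead of one member at a time, might look like the sketch below. It assumes the question's Member(ID, Email) and MemberToTeam(MemberID_fk) names and treats the lowest ID per email as the single instance to keep. The alias k and the column name KeepID are hypothetical.

```sql
-- Sketch: repoint every reference to a duplicate member in one statement.
-- Assumes Member(ID, Email) and MemberToTeam(MemberID_fk) as in the question.
update mt
set mt.MemberID_fk = k.KeepID
from MemberToTeam mt
inner join Member m
    on m.ID = mt.MemberID_fk
inner join
(
    -- one row per email: the lowest ID is treated as the instance to keep
    select Email, min(ID) as KeepID
    from Member
    group by Email
) k
    on k.Email = m.Email
where mt.MemberID_fk <> k.KeepID;  -- skip rows that already point at the kept ID
```

The same statement would need to be repeated for each referencing table, such as RaceRank. If a referencing table has its own primary key or unique constraint that includes MemberID_fk, two duplicates of the same person in the same team or race could collide after the update, so that case is worth checking first.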

2

This code will solve the problem:

--MemberToTeam
;with cte_dupes as
(
    -- rn = 1 marks the lowest ID per email, which is the row to keep;
    -- ordering by ID makes that choice deterministic
    select ID, Email,
        row_number() over (partition by Email order by ID) rn
    from Member
)
update mt
    set MemberID_fk = (select k.ID from cte_dupes k where k.rn = 1 and k.Email = m.Email)
from MemberToTeam mt
inner join Member m on m.ID = mt.MemberID_fk
inner join cte_dupes cte on cte.ID = mt.MemberID_fk and cte.rn > 1;


--RaceRank
;with cte_dupes as
(
    -- same numbering as above: rn = 1 is the member that is kept
    select ID, Email,
        row_number() over (partition by Email order by ID) rn
    from Member
)
update r
    set MemberID_fk = (select k.ID from cte_dupes k where k.rn = 1 and k.Email = m.Email)
from RaceRank r
inner join Member m on m.ID = r.MemberID_fk
inner join cte_dupes cte on cte.ID = r.MemberID_fk and cte.rn > 1;
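
The updates above only repoint the references. The remaining steps from the question, deleting the duplicate members and adding the unique constraint, could be sketched as follows. This assumes the question's Member(ID, Email) table, that every referencing table has already been updated, and that the lowest ID per email is the row to keep. The constraint name UQ_Member_Email is hypothetical.

```sql
-- Sketch: run only after every referencing table has been repointed.
set xact_abort on;   -- any error rolls back the whole transaction
begin transaction;

-- Remove every member that is not the lowest ID for its email
with cte_dupes as
(
    select ID,
           row_number() over (partition by Email order by ID) as rn
    from Member
)
delete from cte_dupes
where rn > 1;

-- Prevent new duplicates; UQ_Member_Email is a hypothetical constraint name
alter table Member
    add constraint UQ_Member_Email unique (Email);

commit transaction;
```

If foreign key constraints exist and a referencing table was missed, the delete should fail rather than orphan rows, which is a useful safety net. Rows with a NULL email would be grouped together and treated as duplicates of each other by this sketch, so they may need to be excluded or handled separately.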