2013-02-21

Identifying duplicates and the unique records they match in Oracle

Hi, I am running the following query to identify duplicate records.

SELECT *
  FROM unique2 P
 WHERE EXISTS (
         SELECT 1
           FROM unique2 C
          WHERE C.surname = P.surname
            AND C.postcode = P.postcode
            AND (((C.forename IS NULL OR P.forename IS NULL)
                  AND C.initials = P.initials)
                 OR C.forename = P.forename)
            AND (C.sex = P.sex
                 OR C.title = P.title)
            AND (C.address1 = P.address1
                 OR C.address1 = P.address2
                 OR C.address2 = P.address1
                 OR instr(C.address1_notrim, P.address1_notrim) > 0
                 OR instr(P.address1_notrim, C.address1_notrim) > 0)
            AND C.rowid < P.rowid);

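However, with this query I cannot identify the ID of the unique record that each duplicate matches. Is there a way to identify both the duplicates and the IDs of the unique records (my table has a unique key) that those duplicates match?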

Answers

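-- Oracle requires parentheses around a multi-column list on the left of IN.
select id
from promolog
where (surname, postcode, dob) in (
    -- every (surname, postcode, dob) combination that occurs more than once
    select surname, postcode, dob
    from promolog
    group by surname, postcode, dob
    having count(*) > 1
)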

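Hi, thanks for your response. But I have some other rules for identifying duplicates, namely: a. DOB AND b. postcode AND c. surname AND d. address, where the address matches if: i. mailed_address1 = mailed_address1, ii. OR mailed_address1 = mailed_address2, iii. OR mailed_address2 = mailed_address1, iv. OR mailed_address1 of record 1 is contained in mailed_address1 of record 2, v. OR mailed_address1 of record 2 is contained in mailed_address1 of record 1. – subash 2013-02-21 14:22:16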


@subash Just add or change whichever fields you need to compare in this query (I used the 3 fields from the original post: surname, postcode, dob). – tbone 2013-02-21 14:32:50


@subash: It would be better if you updated your question. What problem are you getting? – 2013-02-21 14:33:29


You can also do this with analytic functions:

select id, num_of_ids, first_id, surname, postcode, dob 
from (
    select id, 
     count(*) over (partition by surname, postcode, dob) as num_of_ids, 
     first_value(id) 
      over (partition by surname, postcode, dob order by id) as first_id, 
     surname, 
     postcode, 
     dob 
    from promolog 
) 
where num_of_ids > 1; 

Based on your update, I think you can do a self-join, which can be as complicated as you need, something like:

select dup.*, master.id as duplicate_of 
from promolog dup 
join promolog master 
on master.surname = dup.surname 
and master.postcode = dup.postcode 
and master.dob = dup.dob 
... and <address checks etc. > ... 
and master.rowid < dup.rowid; 

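But maybe I'm still missing something. As the name suggests, exists only tests whether a matching record exists; if you want to retrieve any data from the matching record, you need to join to it at some point.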
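To make that concrete, here is a rough sketch of the question's EXISTS query turned into a self-join against unique2, with the same matching rules. The question says the table has a unique key but never names it, so the column name id below is an assumption; substitute the real key column. Taking MIN(C.id) is just one possible way to get a single row per duplicate when several earlier rows match.

```sql
-- Sketch only: "id" is an assumed name for the unique key of unique2.
SELECT P.id      AS duplicate_id,
       MIN(C.id) AS duplicate_of   -- lowest id among the earlier rows that match
  FROM unique2 P
  JOIN unique2 C
    ON C.surname = P.surname
   AND C.postcode = P.postcode
   AND (((C.forename IS NULL OR P.forename IS NULL)
         AND C.initials = P.initials)
        OR C.forename = P.forename)
   AND (C.sex = P.sex
        OR C.title = P.title)
   AND (C.address1 = P.address1
        OR C.address1 = P.address2
        OR C.address2 = P.address1
        OR instr(C.address1_notrim, P.address1_notrim) > 0
        OR instr(P.address1_notrim, C.address1_notrim) > 0)
   AND C.rowid < P.rowid           -- same tie-breaker as the original query
 GROUP BY P.id;
```

Dropping the GROUP BY and selecting C.id directly would list every matching pair instead. Because these matching rules are not transitive, the row reported as duplicate_of may itself be flagged as a duplicate of some other row, so the output may need a second pass if a single master per group is required.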
