2011-05-09 54 views
2

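Identifying duplicate combinations in n-field data tables on Oracle

I managed to write a SQL query that updates duplicate rows, where a duplicate is the same combination of values, to null for a 2-field table. However, I am stuck on tables with more than 2 fields.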

My solution for 2 fields is:

Insert test data into the combinations table:

create table combinations as 
select 1 col1, 2 col2 from dual --row1 
union all 
select 2, 1 from dual --row2 
union all 
select 1, 3 from dual --row3 
union all 
select 1,4 from dual; --row4

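In the combinations table, row1 and row2 are duplicates, because the order of the elements does not matter.

Update duplicate combinations to null for 2 fields (row2 is updated to null):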
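
As a minimal sketch of why order-insensitive comparison works here, the query below previews a normalized form of each pair using Oracle's least and greatest functions. The aliases lo and hi are illustrative names, not part of the original post. Rows (1, 2) and (2, 1) both normalize to lo = 1, hi = 2, which is what marks them as the same combination.

```sql
-- Preview only: nothing is modified.
-- lo/hi are hypothetical aliases for the smaller and larger value of each row.
select col1,
       col2,
       least(col1, col2)    lo,
       greatest(col1, col2) hi
from combinations;
```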

update combinations 
set col1=null, col2=null 
where rowid IN(
select x.rid from (
    select 
     rowid rid, 
     col1, 
     col2, 
     row_number() over (partition by least(col1,col2), greatest(col1,col2) 
           order by rownum) duplicate_row 
    from combinations) x 
where duplicate_row > 1); 

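My code above relies on the least() and greatest() functions, which is why it works so neatly. Any ideas on how to adapt this code to a 3-field table?

Insert test data into the combinations2 table (3 fields):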

create table combinations2 as 
select 1 col1, 2 col2, 3 col3 from dual --row1 
union all 
select 2, 1, 3 from dual --row2 
union all 
select 1, 3, 2 from dual; --row3

In the 3-field combinations2 table, row1, row2 and row3 are all equal. My goal is to update row2 and row3 to null.

+1

This looks like the same question as: http://stackoverflow.com/questions/5924118/sql-and-unique-n-coulmn-combinations – 2011-05-09 14:51:47

Answers

1
update combinations2 
set col1 = NULL 
    , col2 = NULL 
    , col3 = NULL 
where rowid in (
      select r 
      from 
       (
       -- STEP 4 
       select r, row_number() over(partition by colls order by colls) duplicate_row 
       from 
        (
        -- STEP 3 
        select r, c1 || '_' || c2 || '_' || c3 colls 
        from 
         (
         -- STEP 2 
         select r 
           , max(case when rn = 1 then val else null end) c1 
           , max(case when rn = 2 then val else null end) c2 
           , max(case when rn = 3 then val else null end) c3 
         from 
          (
          -- STEP 1 
          select r 
            , val 
            , row_number() over(partition by r order by val) rn 
          from 
           (
            select rowid as r, col1 as val 
            from combinations2 
           union all 
            select rowid, col2 
            from combinations2 
           union all 
            select rowid, col3 
            from combinations2 
           ) 
          ) 
         group by r 
         ) 
        ) 
       ) 
      where duplicate_row > 1 
      ) 
; 
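  • Step 1: sort the values within each row
  • Step 2: rebuild each row from its sorted values
  • Step 3: concatenate the columns into a string
  • Step 4: find the duplicates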
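
As a side note that is not part of the original answer: on Oracle 11g Release 2 or later, steps 2 and 3 could probably be collapsed with LISTAGG, which sorts and concatenates in one pass and does not need one `max(case ...)` line per column. The sketch below assumes that version and reuses the answer's aliases (r, val, colls). It also orders by r in step 4, so the row kept in each group is the one with the lowest rowid, which need not be the first row inserted.

```sql
-- Sketch only; assumes Oracle 11g Release 2+ (LISTAGG availability).
update combinations2
set col1 = null,
    col2 = null,
    col3 = null
where rowid in (
    select r
    from (
        -- number the rows that share the same sorted key
        select r,
               row_number() over (partition by colls order by r) duplicate_row
        from (
            -- sort each row's values and join them into one key, e.g. '1_2_3'
            select r,
                   listagg(val, '_') within group (order by val) colls
            from (
                -- unpivot: one (rowid, value) pair per column
                select rowid as r, col1 as val from combinations2
                union all
                select rowid, col2 from combinations2
                union all
                select rowid, col3 from combinations2
            )
            group by r
        )
    )
    where duplicate_row > 1
);
```

Adding a fourth column would then only need one more `union all` branch and one more column in the `set` list.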
+0

Nice, but it needs a small modification: it does not work if you add more values, for example 1,4,2 and 4,2,1 – mcha 2011-05-09 15:01:58

+0

In step 4 it should be row_number() over(partition by colls order by colls) – mcha 2011-05-09 15:08:52

+0

Thanks for the report. I somehow lost that while formatting. I have edited my answer, and it should work fine now. – schurik 2011-05-09 15:54:22
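
To check the case mcha raised, here is a self-contained sketch that previews a sorted key for the original test rows plus 1,4,2 and 4,2,1. It uses a different technique from the answer: for exactly three non-null numeric columns, the middle value can be derived as the sum minus the smallest and largest values. The aliases t, lo, mid and hi are illustrative only. The first three rows should all yield 1, 2, 3 and the last two 1, 2, 4, so the two groups stay separate.

```sql
-- Preview only; the inline view t stands in for combinations2 with extra rows.
-- Assumes exactly three numeric, non-null columns.
with t as (
    select 1 col1, 2 col2, 3 col3 from dual
    union all
    select 2, 1, 3 from dual
    union all
    select 1, 3, 2 from dual
    union all
    select 1, 4, 2 from dual
    union all
    select 4, 2, 1 from dual
)
select col1,
       col2,
       col3,
       least(col1, col2, col3) lo,
       -- middle value = total minus the smallest and the largest
       col1 + col2 + col3
         - least(col1, col2, col3)
         - greatest(col1, col2, col3) mid,
       greatest(col1, col2, col3) hi
from t;
```

This shortcut does not generalize beyond three columns; for wider tables the unpivot-and-sort approach in the answer is the more natural fit.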