2017-07-31 88 views

Remove duplicates from a comma-separated list using a regular expression

contract, clause 1, Subsection 1.1, contract, clause 1, Subsection 1.2, 
paragraph (a), contract, clause 1, Subsection 1.2, paragraph (b), contract, 
clause 2 

From the string above, I want to get:

contract, clause 1, Subsection 1.1, Subsection 1.2, paragraph (a), paragraph 
(b), clause 2 

I have read that a regular expression can do this, but I can't work out which pattern to use.

Please help.

People may be reluctant to help unless you try something yourself and post your attempt here. –

Answers

1

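Based on this link about splitting comma-separated values into rows, I split the string into rows, kept the position of the first occurrence of each value, took the distinct values, and then aggregated them back into a single string: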

with test_string as ( 
select 1 as id, 
'contract, clause 1, Subsection 1.1, contract, clause 1, Subsection 1.2, paragraph (a), contract, clause 1, Subsection 1.2, paragraph (b), contract, clause 2' val 
from dual) 
-- 3. re-aggregate the distinct words, ordered by their first position
select id, listagg(word,', ') WITHIN GROUP (order by position) as deduped FROM (
-- 2. keep one row per word, carrying the position of its first occurrence
select distinct id, first_value(position) over (partition by id, word order by position) position, word from (
-- 1. split val into one row per comma-separated item (number of commas + 1 items)
select 
    distinct t.id, 
    levels.column_value as position, 
    trim(regexp_substr(t.val, '[^,]+', 1, levels.column_value)) as word 
from 
    test_string t, 
    table(cast(multiset(select level from dual connect by level <= length (regexp_replace(t.val, '[^,]+')) + 1) as sys.OdciNumberList)) levels 
) 
) GROUP BY id 
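
To sanity-check what the query above should return, the same three steps (split on commas, keep the first occurrence of each value, rejoin) can be sketched outside the database. This is only an illustration in Python, not Oracle code, and the function name `dedupe_csv` is made up for this example.

```python
def dedupe_csv(value: str) -> str:
    """Drop repeated items from a comma-separated string, keeping first occurrences in order."""
    # Split on commas and trim whitespace, roughly what
    # trim(regexp_substr(val, '[^,]+', 1, n)) does row by row in the query.
    words = [w.strip() for w in value.split(",")]
    # dict keys keep insertion order (Python 3.7+), so each word stays at
    # the position where it first appeared; empty items are skipped.
    unique = dict.fromkeys(w for w in words if w)
    return ", ".join(unique)


val = (
    "contract, clause 1, Subsection 1.1, contract, clause 1, Subsection 1.2, "
    "paragraph (a), contract, clause 1, Subsection 1.2, paragraph (b), "
    "contract, clause 2"
)
print(dedupe_csv(val))
# contract, clause 1, Subsection 1.1, Subsection 1.2, paragraph (a), paragraph (b), clause 2
```

The printed line matches the result requested in the question, so it can serve as a reference when comparing against the output of the SQL.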

If you don't care about preserving the original order:

with test_string as ( 
select 1 as id, 
'contract, clause 1, Subsection 1.1, contract, clause 1, Subsection 1.2, paragraph (a), contract, clause 1, Subsection 1.2, paragraph (b), contract, clause 2' val 
from dual) 
select id, listagg(word,', ') WITHIN GROUP (order by 1) FROM (
select 
    distinct t.id, 
    trim(regexp_substr(t.val, '[^,]+', 1, levels.column_value)) as word 
from 
    test_string t, 
    table(cast(multiset(select level from dual connect by level <= length (regexp_replace(t.val, '[^,]+')) + 1) as sys.OdciNumberList)) levels 
) GROUP BY id 

There may be a simpler solution, but none came to mind. – LauDec

I hadn't seen that the answer here is much the same and possibly better: https://stackoverflow.com/questions/40259200/how-to-remove-duplicates-from-space-separated-list-by-oracle-regexp-replace – LauDec

Thank you very much @LauDec, you solved my problem. –
