2017-08-26
0

I have these values and want to group the duplicates among the comma-separated entries:

SITE_NAME | CATEGORY | 
---------------------- 
SITE1 | CAR, TRAVEL 
SITE2 | TRAVEL 
SITE3 | SPORT, GAME 
SITE4 | GAME 
SITE5 | CAR 
SITE6 | TRAVEL 
SITE7 | GAME 

I want to count the duplicate values in the table, so I used this:

SELECT category, COUNT(*) FROM table_db GROUP BY category HAVING COUNT(*) >= 1

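This works for grouping rows whose 'category' values are identical, but it treats 'CAR, TRAVEL' as a value distinct from 'CAR'. I want the items inside it to be recognized as duplicates too.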

This code returns:

CAR, TRAVEL 
TRAVEL 
SPORT, GAME 
CAR 
GAME 

I would like it to look like this:

CAR 
TRAVEL 
SPORT 
GAME 
+6

You should really change the design of your original table. Never store multiple values in a single column! –

+2

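This schema is in flagrant violation of the [Zero, One or Infinity Rule](http://en.wikipedia.org/wiki/Zero_one_infinity_rule) of [database normalization](http://en.wikipedia.org/wiki/Database_normalization). If you adjusted it to a proper normal form, this would be trivial. – tadman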

+0

I can't redesign the database. I meant to explain that in the post. –

Answers

0

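While I fundamentally agree with the other comments about the database design, if for some reason you are stuck with your design, then you need to create a split function. Something like this: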

CREATE FUNCTION public.fnsplit(
    IN stringlist character varying, 
    IN delimit character varying) 
    RETURNS TABLE(items character varying) AS 
$BODY$ 
declare remainderlist character varying; 
declare front character varying; 
declare delimitpos integer; 
begin 
    drop table if exists tmptbl; 
    create temp table tmptbl(items character varying); 
    remainderlist := $1; 
    delimitpos := strpos(remainderlist, $2); 
    while delimitpos > 0 loop 
     front := trim(both from(left(remainderlist, delimitpos -1))); 
     remainderlist := substr(remainderlist, delimitpos + length($2)); -- skip the whole delimiter, not just one character
     if length(front) > 0 then 
      insert into tmptbl values (front); 
     end if; 
     delimitpos := strpos(remainderlist, $2); 
    end loop; 
    --insert last value 
    remainderlist := trim(both from remainderlist); 
    if length(remainderlist) > 0 then 
     insert into tmptbl values (remainderlist); 
    end if; 
    return query 
     select * from tmptbl; 
     return; 
end; 
$BODY$ 
    LANGUAGE plpgsql VOLATILE 
    COST 100 
    ROWS 1000; 

You could then use it in your select like this:

SELECT category, COUNT (*) FROM 
(SELECT fnsplit(category, ', ') as category FROM table_db) d 
group by category having count(*) >= 1; 

I can't stress enough, though, that this should be a last resort!
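To make the split-and-count logic easier to check outside the database, here is a minimal Python sketch of what the function and the grouping query do together: split each value on the delimiter, trim whitespace, drop empty pieces, then count. The helper name `split_items` is hypothetical, and the sample rows are copied from the question's CATEGORY column. It only illustrates the logic and does not replace the SQL.

```python
from collections import Counter


def split_items(string_list, delimiter=","):
    """Split on the delimiter, trim whitespace, and drop empty pieces."""
    pieces = (piece.strip() for piece in string_list.split(delimiter))
    return [piece for piece in pieces if piece]


# Sample rows copied from the question's CATEGORY column.
categories = [
    "CAR, TRAVEL",
    "TRAVEL",
    "SPORT, GAME",
    "GAME",
    "CAR",
    "TRAVEL",
    "GAME",
]

# Flatten every row into individual items, then count each item.
counts = Counter(item for row in categories for item in split_items(row))

for name in sorted(counts):
    print(name, counts[name])
```

Each row contributes one count per item it contains, so 'CAR, TRAVEL' adds to both CAR and TRAVEL instead of forming its own group.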

Edit

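It was pointed out that the OP wants MySQL. That is a bit trickier, because MySQL does not allow functions to return tables, so you have to use a temporary table instead. The routine, now a stored procedure, looks like this: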

DELIMITER $$ 
CREATE PROCEDURE fnsplit(
    stringlist varchar(2000), 
    delimit varchar(20) 
) 
BEGIN 

declare remainderlist varchar(2000); 
declare front varchar(2000); 
declare delimitpos integer; 

    SET remainderlist = stringlist; 
    SET delimitpos = position(delimit in remainderlist); 
    while delimitpos > 0 do 
     SET front = trim(both from(left(remainderlist, delimitpos -1))); 
     SET remainderlist = substr(remainderlist, delimitpos + char_length(delimit)); -- skip the whole delimiter
     if length(front) > 0 then 
      insert into tblTmpSplit values (front); 
     end if; 
     SET delimitpos = position(delimit in remainderlist); 
    end while; 
    SET remainderlist = trim(both from remainderlist); 
    if length(remainderlist) > 0 then 
     insert into tblTmpSplit values (remainderlist); 
    end if; 

END$$ 
DELIMITER ; 

Now you can call it like this:

SET @allcategories = (SELECT GROUP_CONCAT(category separator ', ') FROM table_db); 

drop table if exists tbltmpsplit; 
create temporary table tbltmpsplit(items varchar(2000)); 

call fnsplit(@allcategories, ', '); 

SELECT *, Count(*) FROM tbltmpsplit GROUP BY items having count(*) >= 1; 

drop table if exists tbltmpsplit; 

This returns:

CAR 2 
GAME 3 
SPORT 1 
TRAVEL 3
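
As a cross-check on those counts, the sketch below does the same split in a single query with a recursive common table expression. This is not the procedure above but a swapped-in technique, and it runs against an in-memory SQLite database through Python's `sqlite3` module only so that it is runnable as-is. The table and column names follow the question; the CTE name `split` and its columns `item` and `rest` are made up for illustration. MySQL 8.0 and later also support `WITH RECURSIVE`, but the string functions differ, so treat this as a starting point and not as tested MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_db (site_name TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO table_db VALUES (?, ?)",
    [
        ("SITE1", "CAR, TRAVEL"),
        ("SITE2", "TRAVEL"),
        ("SITE3", "SPORT, GAME"),
        ("SITE4", "GAME"),
        ("SITE5", "CAR"),
        ("SITE6", "TRAVEL"),
        ("SITE7", "GAME"),
    ],
)

# Each step peels the text before the first comma off `rest`.
# A trailing comma is appended so the last item is peeled off the same way.
# UNION ALL (not UNION) keeps identical rows coming from different sites.
QUERY = """
WITH RECURSIVE split(item, rest) AS (
    SELECT '', category || ',' FROM table_db
    UNION ALL
    SELECT trim(substr(rest, 1, instr(rest, ',') - 1)),
           substr(rest, instr(rest, ',') + 1)
    FROM split
    WHERE rest <> ''
)
SELECT item, COUNT(*) FROM split
WHERE item <> ''
GROUP BY item
ORDER BY item
"""

rows = conn.execute(QUERY).fetchall()
for item, total in rows:
    print(item, total)
```

The query yields the same four rows as the result listed above.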