2015-11-11 32 views

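Checking the uniqueness of data across 3 columns with PostgreSQL

I'm having trouble checking the uniqueness of data with PostgreSQL. I have a people table with the following data: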

 id | identifier | first_name | middle_name | last_name |      email      |         created_at         |         updated_at
----+------------+------------+-------------+-----------+-----------------+----------------------------+----------------------------
  1 | identifier | First      | A.          | Last      | [email protected] | 2015-11-11 14:46:17.782689 | 2015-11-11 14:46:17.782689
  2 | identifier | First 2    | M.          | Last 2    | [email protected] | 2015-11-11 14:46:17.790697 | 2015-11-11 14:46:17.790697
(2 rows)

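Now I want to find the ids of the records whose identifier belongs to more than one first_name/last_name combination. In this example, two records share the same identifier but have different first and last names. I tried to check for such duplicates with the SQL below, but it returns nothing: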

SELECT 
    identifier, first_name, last_name, COUNT(*) 
FROM 
    people 
GROUP BY 
    identifier, first_name, last_name 
HAVING 
    COUNT(*) > 1 

You should exclude 'first_name, last_name' from both the 'select' and the 'group by'. – potashin
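To illustrate the comment above: grouping by identifier, first_name and last_name puts every distinct name combination in its own group, so no group ever has more than one row. A minimal sketch of the alternative, grouping by identifier only and counting distinct name pairs, might look like this. The inline sample data in the CTE and the column alias name_combinations are assumptions added for illustration; against the real table the WITH clause would be dropped.

```sql
-- Hypothetical inline sample data mirroring the question's two rows
-- (only the columns relevant to the check are included).
WITH people(id, identifier, first_name, last_name) AS (
    VALUES (1, 'identifier', 'First',   'Last'),
           (2, 'identifier', 'First 2', 'Last 2')
)
SELECT identifier,
       -- (first_name, last_name) is a row constructor, so each pair is
       -- counted as one value
       COUNT(DISTINCT (first_name, last_name)) AS name_combinations
FROM people
GROUP BY identifier
HAVING COUNT(DISTINCT (first_name, last_name)) > 1;
```

With the sample rows this should report the shared identifier with two name combinations.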

Answers

1

With grouping:

select * from people 
where identifier in(select identifier from people 
        group by identifier 
        having count(distinct first_name) > 1 or 
          count(distinct last_name) > 1) 

With exists:

select * from people p1 
where exists(select * from people p2 
      where p1.identifier = p2.identifier and 
        (p1.first_name <> p2.first_name or p1.last_name <> p2.last_name)) 
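One caveat with the exists version, sketched under the assumption that the name columns can contain NULL (the question does not say either way): `<>` evaluates to NULL when one side is NULL, so such rows are silently skipped. A possible variant uses IS DISTINCT FROM on row constructors, which treats NULL as an ordinary comparable value. The sample rows for identifier 'b' are hypothetical and exist only to show the difference.

```sql
-- Hypothetical sample data: 'a' has two different names,
-- 'b' differs only in that one first_name is NULL.
WITH people(id, identifier, first_name, last_name) AS (
    VALUES (1, 'a', 'First',   'Last'),
           (2, 'a', 'First 2', 'Last 2'),
           (3, 'b', NULL,      'Last'),
           (4, 'b', 'First',   'Last')
)
SELECT p1.*
FROM people p1
WHERE EXISTS (
    SELECT 1
    FROM people p2
    WHERE p1.identifier = p2.identifier
      -- NULL-safe comparison of the whole name pair
      AND (p1.first_name, p1.last_name)
          IS DISTINCT FROM (p2.first_name, p2.last_name)
)
ORDER BY p1.id;
```

Here all four rows should come back, whereas the `<>` form would return only the rows for identifier 'a'.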
2

If you just want the duplicated identifiers:

select identifier
from people p
group by identifier
having count(*) > 1;

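If you want the identifiers where the names differ:

select identifier
from people p
group by identifier
having min(first_name) <> max(first_name) or
       min(last_name) <> max(last_name);

(Or: having count(distinct (first_name, last_name)) > 1. Note the extra parentheses: PostgreSQL's count(distinct ...) takes a single expression, so the two columns are wrapped in a row constructor.)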

If you want the original rows, I would use window functions:

select p.*
from (select p.*,
             min(first_name) over (partition by identifier) as minfn,
             max(first_name) over (partition by identifier) as maxfn,
             min(last_name) over (partition by identifier) as minln,
             max(last_name) over (partition by identifier) as maxln
      from people p
     ) p
where minfn <> maxfn or minln <> maxln;

This would be easier if Postgres supported count(distinct) as a window function.
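As a possible workaround for the missing count(distinct) window function, the number of distinct name pairs per identifier can be derived from dense_rank(): the highest rank within a partition equals the number of distinct values. This is only a sketch and not part of the original answer; the CTE names ranked and counted, the columns combo_rank and combo_count, and the inline sample data are all assumptions for illustration.

```sql
-- Hypothetical sample data: 'a' maps to two names, 'b' to a single one.
WITH people(id, identifier, first_name, last_name) AS (
    VALUES (1, 'a', 'First',   'Last'),
           (2, 'a', 'First 2', 'Last 2'),
           (3, 'b', 'Solo',    'Person'),
           (4, 'b', 'Solo',    'Person')
),
ranked AS (
    -- Equal name pairs share a rank; each new pair gets the next rank.
    SELECT p.*,
           dense_rank() OVER (PARTITION BY identifier
                              ORDER BY first_name, last_name) AS combo_rank
    FROM people p
),
counted AS (
    -- The maximum rank per identifier is its number of distinct name pairs.
    SELECT r.*,
           max(combo_rank) OVER (PARTITION BY identifier) AS combo_count
    FROM ranked r
)
SELECT id, identifier, first_name, last_name
FROM counted
WHERE combo_count > 1
ORDER BY id;
```

With this sample data only the rows for identifier 'a' should be returned, since 'b' has a single name pair repeated twice.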