2013-12-15

Linking duplicate records in PostgreSQL

How can I link duplicate records in PostgreSQL? I found them with:

SELECT * FROM (
    SELECT id, import_id, name, 
    ROW_NUMBER() OVER(PARTITION BY address ORDER BY name asc) AS Row 
    FROM companies 
) dups 
where 
dups.Row > 1 ORDER BY dups.name; 

See the sample code and demo at http://sqlfiddle.com/#!15/af016/7/1

I want to add a column named linked_id to companies, set to the import_id of the first record in each group of duplicates.

+0

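It is probably a good idea not to use "Row" as a column alias. It is confusing, and it is also a keyword in some contexts. By the way, questions like this are much easier to answer if you post some sample data/schema as "CREATE TABLE" and "INSERT" statements. SQLFiddle.com can be handy and has a text-to-SQL conversion tool. –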
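
As a sketch of that suggestion, the duplicate-finding query from the question could be written with a neutral alias such as rn (a name chosen here purely for illustration). Nothing else about the query changes:

```sql
-- Same query as in the question; only the alias differs (rn is an illustrative name).
SELECT *
FROM (
    SELECT id, import_id, name,
           ROW_NUMBER() OVER (PARTITION BY address ORDER BY name ASC) AS rn
    FROM companies
) dups
WHERE dups.rn > 1
ORDER BY dups.name;
```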

+0

The "Row" alias came from an example I found at http://stackoverflow.com/questions/14471179 – Circuitsoft

Answers

1

Try:

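-- Overwrites import_id on each duplicate (rn > 1) with the import_id
-- of the first row in its (name, address) group.
-- Note: ORDER BY name cannot break ties inside a partition on name,
-- so which row counts as "first" is arbitrary.
UPDATE companies c
SET import_id = q.import_id
FROM (
    SELECT id,
           FIRST_VALUE(import_id)
               OVER (PARTITION BY name, address ORDER BY name ASC) AS import_id,
           ROW_NUMBER()
               OVER (PARTITION BY name, address ORDER BY name ASC) AS rn
    FROM companies
) q
WHERE c.id = q.id AND q.rn > 1;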

Demo: http://sqlfiddle.com/#!15/af016/10
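
If the goal is to leave import_id untouched and record the link in a separate column, as the question describes, the same window-function approach can be pointed at linked_id instead. This is only a sketch under assumptions: the linked_id column does not exist yet, import_id is assumed to be an integer (adjust the type to match your schema), and id is assumed to be a unique key, used here as a tie-breaker so that the "first" row in each group is deterministic.

```sql
-- Assumption: import_id is an integer; change the type if yours differs.
ALTER TABLE companies ADD COLUMN linked_id integer;

UPDATE companies c
SET linked_id = q.first_import_id
FROM (
    SELECT id,
           FIRST_VALUE(import_id) OVER w AS first_import_id,
           ROW_NUMBER()           OVER w AS rn
    FROM companies
    -- Assumption: id is unique, so ordering by it picks a stable first row.
    WINDOW w AS (PARTITION BY name, address ORDER BY id)
) q
WHERE c.id = q.id
  AND q.rn > 1;  -- only the duplicates; the first row keeps linked_id NULL
```

Defining the window once with a WINDOW clause keeps FIRST_VALUE and ROW_NUMBER working over exactly the same partitioning and ordering.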

+0

I found my answer just as you posted yours. Thanks! – Circuitsoft

1

This sets parent_id to the import_id of the first company that matches.

UPDATE companies
SET parent_id = rs.parent_id
FROM (
    SELECT id,
           first_value(import_id)
               OVER (PARTITION BY address ORDER BY name) AS parent_id
    FROM companies
) AS rs
WHERE rs.id = companies.id;
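
Note that this update touches every row, so a company with a unique address ends up with parent_id equal to its own import_id. One way to inspect the result, sketched here on the assumption that the parent_id column already exists and has been populated by the statement above, is to list only the groups that contain more than one row:

```sql
-- Assumption: parent_id has already been filled in by the UPDATE above.
-- Shows each shared parent_id and how many companies point at it.
SELECT parent_id, COUNT(*) AS group_size
FROM companies
GROUP BY parent_id
HAVING COUNT(*) > 1
ORDER BY parent_id;
```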