2011-03-03 118 views
5

Below is a simplified example of my problem. I have a table with a "Name" column that contains duplicate entries:

ID Name 
--- ---- 
1 AAA 
2 AAA 
3 AAA 
4 BBB 
5 CCC 
6 CCC 
7 DDD 
8 DDD 
9 DDD 
10 DDD 

Doing a GROUP BY like SELECT Name, COUNT(*) AS [Count] FROM Table GROUP BY Name results in:

Name Count 
---- ----- 
AAA 3 
BBB 1 
CCC 2 
DDD 4 

I'm only concerned with the duplicates, so I'll add a HAVING clause, SELECT Name, COUNT(*) AS [Count] FROM Table GROUP BY Name HAVING COUNT(*) > 1:

Name Count 
---- ----- 
AAA 3 
CCC 2 
DDD 4 

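Trivial so far, but now things get tricky: I need a query that returns all the duplicate records, with a nice incrementing indicator added to the Name column. The result should look something like this: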

ID Name 
--- -------- 
1 AAA 
2 AAA (2) 
3 AAA (3) 
5 CCC 
6 CCC (2) 
7 DDD 
8 DDD (2) 
9 DDD (3) 
10 DDD (4) 

Note that row 4 with "BBB" is excluded, and that the first duplicate keeps its original name.

Using an EXISTS statement gives me all the records I need, but how do I go about creating the new Name value?

SELECT * FROM Table AS T1 
WHERE EXISTS (
    SELECT Name, COUNT(*) AS [Count] 
    FROM Table 
    GROUP BY Name 
    HAVING (COUNT(*) > 1) AND (Name = T1.Name)) 
ORDER BY Name 

I need to create an UPDATE statement that fixes all the duplicates, i.e. changes the Name according to this pattern.

Update: Figured it out now. It was the PARTITION BY clause I was missing.

Answers

10
With Dups As 
    (
    Select Id, Name 
     , Row_Number() Over (Partition By Name Order By Id) As Rnk 
    From Table 
    ) 
Select D.Id 
    , D.Name + Case 
       When D.Rnk > 1 Then ' (' + Cast(D.Rnk As varchar(10)) + ')' 
       Else '' 
       End As Name 
From Dups As D 

If you want an UPDATE statement, you can use almost the same structure:

With Dups As 
    (
    Select Id, Name 
     , Row_Number() Over (Partition By Name Order By Id) As Rnk 
    From Table 
    ) 
Update Table 
Set Name = T.Name + Case 
        When D.Rnk > 1 Then ' (' + Cast(D.Rnk As varchar(10)) + ')' 
        Else '' 
        End 
From Table As T 
    Join Dups As D 
     On D.Id = T.Id 
+0

I dig CTEs. I wish I used them more often. – SQLMason 2011-03-03 04:00:47

+0

Your solution looks neater, but I can't get it to work. – 2011-03-03 04:53:36

+0

@Jakob Gade - Are you getting an error, or is it not producing the results you want? – Thomas 2011-03-03 05:31:52

1
SELECT ROW_NUMBER() OVER(ORDER BY Name) AS RowNum, 
     Name, 
     Name + '(' + ROW_NUMBER() OVER(PARTITION BY Name ORDER BY Name) + ')' concatenatedName 
FROM Table 
WHERE Name IN 
(
    SELECT Name 
    FROM Table 
    GROUP BY Name 
    HAVING COUNT(*) > 1 
) 

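This will give you what you originally asked for. For the UPDATE statement, you'll want to run a WHILE loop and update the TOP 1 row each time: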

DECLARE @Pointer VARCHAR(20), @Count INT 

WHILE EXISTS(SELECT Name FROM Table GROUP BY Name HAVING COUNT(1) > 1) 
BEGIN 
    SELECT TOP 1 @Pointer = Name, @Count = COUNT(1) FROM Table GROUP BY Name HAVING COUNT(1) > 1 
    UPDATE TOP (1) TABLE 
    SET Name = Name + '(' + @Count + ')' 
    WHERE Name = @Pointer 
END 
+0

Sorry, I'm going from memory. I don't have SSMS on my laptop. You can wrap the concatenated name in a CASE to remove the first (1). – SQLMason 2011-03-03 03:41:02

+0

Excellent, it works! I just had to add a CAST for the row number and the CASE you suggested. – 2011-03-03 03:54:23

+0

Thanks, I forgot the CAST. If I had SSMS installed here, I would have tested it. – SQLMason 2011-03-03 03:58:23

0

There is no need to do an UPDATE at all. The following produces the rows for an INSERT as required:

SELECT 
    ROW_NUMBER() OVER(ORDER BY tb2.Id) Id, 
    tb2.Name + CASE WHEN COUNT(*) > 1 THEN ' (' + CONVERT(VARCHAR, Count(*)) + ')' ELSE '' END [Name] 
FROM 
    tb tb1, 
    tb tb2 
WHERE 
    tb1.Name = tb2.Name AND 
    tb1.Id <= tb2.Id 
GROUP BY 
    tb2.Name, 
    tb2.Id 
4

Just update the subquery directly:

update d 
set Name = Name+'('+cast(r as varchar(10))+')' 
from ( select Name, 
        row_number() over (partition by Name order by Name) as r 
      from [table] 
     ) d 
where r > 1 
+0

This is even better! – SQLMason 2011-03-03 17:37:57

+0

This method also works for deleting duplicates. – 2011-03-04 05:51:13
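
As a rough sketch of what that comment describes, the same row-numbering pattern can be pointed at a DELETE instead of an UPDATE. This assumes SQL Server, the placeholder table name [table] from the answer, and an Id column that decides which row of each group survives. Ordering by Id is an assumption here; the answer itself orders by Name.

```sql
-- Sketch only: keep the first row of each Name group and delete the rest.
-- [table], Id and Name are placeholders taken from the examples in this thread.
with d as
(
    select Name,
           row_number() over (partition by Name order by Id) as r
    from [table]
)
delete from d
where r > 1
```

Running the inner select on its own first is a cheap way to check which rows would be removed.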

0

Here is a simpler UPDATE statement:

UPDATE 
    tb 
SET 
    [Name] = [Name] + ' (' + CONVERT(VARCHAR, ROW_NUMBER() OVER (PARTITION BY [Name] ORDER BY Id)) + ')' 
WHERE 
    ROW_NUMBER() OVER (PARTITION BY [Name] ORDER BY Id) > 1 
+0

Thanks. I tried it, but I couldn't get it to work. The error message says "Windowed functions can only appear in the SELECT or ORDER BY clauses." – 2011-03-08 02:57:42
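
The error quoted in that comment matches SQL Server's rule that window functions are only allowed in the SELECT list and in ORDER BY, so ROW_NUMBER() cannot sit directly in the SET or WHERE of an UPDATE. One possible way around it, sketched here on the assumption that the table is tb with columns Id and [Name] as in the answer, is to compute the row number inside a CTE and update through the CTE. The CTE name Numbered and the alias Rn are made up for illustration.

```sql
-- Sketch only: compute the row number in a CTE, then update through it.
-- tb, Id and [Name] come from the answer above; Numbered and Rn are hypothetical names.
WITH Numbered AS
(
    SELECT [Name],
           ROW_NUMBER() OVER (PARTITION BY [Name] ORDER BY Id) AS Rn
    FROM tb
)
UPDATE Numbered
SET [Name] = [Name] + ' (' + CONVERT(VARCHAR(10), Rn) + ')'
WHERE Rn > 1
```

The first row of each group (Rn = 1) is left untouched, which matches the result the question asks for. The [Name] column has to be wide enough to hold the added suffix.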