2015-11-24 59 views

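T-SQL UPDATE is slow, but fast when preceded by PRINT

I'm trying to remove duplicates from a list of names and email addresses. The last statement in my script is an UPDATE that takes far longer than it should; I have never actually waited for it to finish. If I put a PRINT 'anything'; statement directly in front of it, it returns immediately. The behavior is the same with or without the WHILE() loop.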

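Here is a simplified version that I hope illustrates the problem. I'm actually creating a table-valued function, so I can't leave the PRINT in there. What effect could the PRINT be having?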

SQL Server 10.50.4033 (2008 R2)

DECLARE @duplicate_names TABLE (
    dnDuplicateKey int, 
    dnPrimaryKey int, 
    PRIMARY KEY (
     dnPrimaryKey, 
     dnDuplicateKey 
    ) 
); 

DECLARE @matches TABLE (
    mFirstKey int, 
    mSecondKey int, 
    PRIMARY KEY (
     mFirstKey, 
     mSecondKey 
    ) 
); 

--Find Email matches 
INSERT INTO @matches 
SELECT DISTINCT 
    f.elKey, 
    s.elKey 
FROM 
    Emails f INNER JOIN 
     Emails s 
    ON f.elEMail = s.elEMail; 

--Find name matches 
INSERT INTO @matches 
SELECT 
    f.NameKey, 
    s.NameKey 
FROM 
    Names f INNER JOIN 
     Names s 
    ON f.Name = s.Name 
WHERE 
    NOT EXISTS (
     SELECT 
      * 
     FROM 
      @matches 
     WHERE 
      mFirstKey = f.NameKey 
      AND mSecondKey = s.NameKey 
    ); 

--Condense duplicate matches 
-- 1 = 2, 
-- 2 = 1, 
-- 3 = 4, 
-- 4 = 3 
--to 
-- 1 = 2, 
-- 3 = 4 
INSERT INTO @duplicate_names 
SELECT 
    mSecondKey, 
    MIN(mFirstKey) 
FROM 
    @matches 
GROUP BY 
    mSecondKey; 

--Condense chained matches 
-- 1 = 2, 
-- 2 = 3, 
-- 3 = 4 
--to 
-- 1 = 2, 
-- 1 = 3, 
-- 1 = 4 
WHILE(@@ROWCOUNT > 0) 
    UPDATE 
     d 
    SET 
     d.dnPrimaryKey = f.dnPrimaryKey 
    FROM 
     @duplicate_names d INNER JOIN (
      @duplicate_names f INNER JOIN 
       @duplicate_names s 
      ON f.dnDuplicateKey = s.dnPrimaryKey 
     ) ON d.dnDuplicateKey = s.dnDuplicateKey 
    WHERE 
     d.dnPrimaryKey <> f.dnPrimaryKey; 

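But of course, the PRINT statement causes @@ROWCOUNT to return 0, which skips the WHILE() loop. I had seen the script run quickly in other situations, though, so I let the PRINT become a red herring. Now it looks like a difference between a table variable (@Table) and a temporary table (#Table). If I dump the results into a temporary table before running the UPDATE, it returns quickly. But I can't use temporary tables in a table-valued function, so I need a different approach.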
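
The @@ROWCOUNT effect described in this comment can be reproduced in isolation. The following is a minimal sketch; the table variable @t and its rows are made up for illustration and are not part of the original script.

DECLARE @t TABLE (id int PRIMARY KEY);

--@@ROWCOUNT reports the rows affected by the previous statement.
INSERT INTO @t VALUES (1), (2), (3);
SELECT @@ROWCOUNT AS AfterInsert;   --3

--PRINT resets @@ROWCOUNT to 0, hiding the INSERT's row count.
INSERT INTO @t VALUES (4);
PRINT 'anything';
SELECT @@ROWCOUNT AS AfterPrint;    --0

--A WHILE(@@ROWCOUNT > 0) placed here would never enter its body,
--so the statement inside it would not run at all.

This suggests the script only appeared fast with the PRINT in place because the UPDATE inside the loop was never executed.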

Answer

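So yes, this is just a performance difference between a table variable and a temporary table. For anyone interested, here is how I solved it without using a temporary table. It is still slower than the first approach run against a temporary table, but since I can't use temporary tables in a table-valued function, it is the only option I see.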

DECLARE @duplicate_names TABLE (
    dnDuplicateKey int, 
    dnPrimaryKey int, 
    PRIMARY KEY (
     dnPrimaryKey, 
     dnDuplicateKey 
    ) 
); 

DECLARE @matches TABLE (
    mFirstKey int, 
    mSecondKey int, 
    PRIMARY KEY (
     mFirstKey, 
     mSecondKey 
    ) 
); 

--Find Email matches 
INSERT INTO @matches 
SELECT DISTINCT 
    f.elKey, 
    s.elKey 
FROM 
    Emails f INNER JOIN 
     Emails s 
    ON f.elEMail = s.elEMail; 

--Find name matches 
INSERT INTO @matches 
SELECT 
    f.NameKey, 
    s.NameKey 
FROM 
    Names f INNER JOIN 
     Names s 
    ON f.Name = s.Name 
WHERE 
    NOT EXISTS (
     SELECT 
      * 
     FROM 
      @matches 
     WHERE 
      mFirstKey = f.NameKey 
      AND mSecondKey = s.NameKey 
    ); 

--Expand orphaned matches (for which no reciprocal version exists). 
WHILE(@@ROWCOUNT > 0) 
    INSERT INTO @matches 
    SELECT DISTINCT 
     f.mFirstKey, 
     s.mSecondKey 
    FROM 
     @matches f INNER JOIN 
      @matches s 
     ON f.mSecondKey = s.mFirstKey 
    WHERE 
     s.mSecondKey <> f.mFirstKey 
     AND NOT EXISTS (
      SELECT 
       * 
      FROM 
       @matches d 
      WHERE 
       d.mFirstKey = f.mFirstKey 
       AND d.mSecondKey = s.mSecondKey 
     ); 

--Condense duplicate matches 
-- 1 = 2, 
-- 2 = 1, 
-- 3 = 4, 
-- 4 = 3 
--to 
-- 1 = 2, 
-- 3 = 4 
INSERT INTO @duplicate_names 
SELECT 
    mSecondKey, 
    MIN(mFirstKey) 
FROM 
    @matches 
GROUP BY 
    mSecondKey;
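

A variation that may be worth testing applies to the chained UPDATE from the question rather than to the script above. It is an untested sketch, not part of the original answer. It assumes the slowness comes from the optimizer compiling the plan with a very low row estimate for the table variables, which OPTION (RECOMPILE) can sometimes correct by letting the statement see the actual row counts at execution time. The variable @rows is hypothetical; it keeps the loop condition independent of whatever statement happens to run before the WHILE. Whether this helps depends on the data, and it should be checked that the hint is accepted inside the function on the target server.

--Assumes @duplicate_names has been declared and populated as in the question.
DECLARE @rows int;
SET @rows = 1;

WHILE (@rows > 0)
BEGIN
    UPDATE
     d
    SET
     d.dnPrimaryKey = f.dnPrimaryKey
    FROM
     @duplicate_names d INNER JOIN
      @duplicate_names s
     ON d.dnDuplicateKey = s.dnDuplicateKey INNER JOIN
      @duplicate_names f
     ON f.dnDuplicateKey = s.dnPrimaryKey
    WHERE
     d.dnPrimaryKey <> f.dnPrimaryKey
    OPTION (RECOMPILE); --assumption: recompiling exposes the real row counts

    --Capture the count immediately, before any other statement resets it.
    SET @rows = @@ROWCOUNT;
END;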