2012-12-04 23 views
0

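I have a table with 200,000 rows. I created a view that removes pieces of data from this table based on several criteria that match my definition of a duplicate record. The code below does this, and I would like to know whether anyone can suggest a faster or more efficient way to write the query. It currently takes about 20 seconds to execute, but I would like it to take a few seconds at most (if not less). I am using SQL Server 2005. My SQL knowledge is very much beginner level, and I appreciate any help. Is it possible to rewrite this SQL query with multiple inner joins so that it executes faster?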

WITH dsm_hardware_basic_cte AS 
(
SELECT TOP 100 PERCENT 
      dbo.dsm_hardware_basic.[UUID] 
      ,dbo.dsm_hardware_basic.[Name] 
      ,dbo.dsm_hardware_basic.[LastAgentExecution] 
      ,dbo.dsm_hardware_basic.[MaxUserRegistration] 
      ,REPLACE(RIGHT([MaxUserRegistration], CHARINDEX('/', REVERSE([MaxUserRegistration])) - 1),'_ADMIN','') AS [MaxUserUsername] 
      ,dbo.dsm_hardware_basic.[LastUserRegistration] 
      ,REPLACE(RIGHT([LastUserRegistration], CHARINDEX('/', REVERSE([LastUserRegistration])) - 1),'_ADMIN','') AS [LastUserUsername] 
      ,dbo.dsm_hardware_basic.[IPAddress] 
      ,dbo.dsm_hardware_basic.[HostName] 
      ,dbo.dsm_hardware_basic.[MACAddress] 
FROM  dbo.dsm_hardware_basic 
) 

SELECT  TOP 100 PERCENT 
      dsm_hardware_basic_cte.[UUID] 
      ,dsm_hardware_basic_cte.[Name] 
      ,dsm_hardware_basic_cte.[LastAgentExecution] 
      ,dsm_hardware_basic_cte.[MaxUserRegistration] 
      ,dsm_hardware_basic_cte.[LastUserRegistration] 
      ,dsm_hardware_basic_cte.[IPAddress] 
      ,dsm_hardware_basic_cte.[HostName] 
      ,dsm_hardware_basic_cte.[MACAddress] 
FROM  dsm_hardware_basic_cte 
      INNER JOIN 
      (
      SELECT [UUID] 
        ,ROW_NUMBER() OVER (PARTITION BY [Name], [MACAddress] ORDER BY [LastAgentExecution] DESC) AS [NameMACRowNum] 
      FROM  dsm_hardware_basic_cte 
      ) AS duplicate_NameMAC_filtered 
      ON duplicate_NameMAC_filtered.[UUID] = dsm_hardware_basic_cte.[UUID] 
      AND duplicate_NameMAC_filtered.[NameMACRowNum] = 1 
      INNER JOIN 
      (
      SELECT [UUID] 
        ,ROW_NUMBER() OVER (PARTITION BY [Name], [HostName] ORDER BY [LastAgentExecution] DESC) AS [NameHostNameRowNum] 
      FROM  dsm_hardware_basic_cte 
      ) AS duplicate_NameHostName_filtered 
      ON duplicate_NameHostName_filtered.[UUID] = dsm_hardware_basic_cte.[UUID] 
      AND duplicate_NameHostName_filtered.[NameHostNameRowNum] = 1 
      INNER JOIN 
      (
      SELECT [UUID] 
        ,ROW_NUMBER() OVER (PARTITION BY [HostName], [MACAddress] ORDER BY [LastAgentExecution] DESC) AS [HostNameMACRowNum] 
      FROM  dsm_hardware_basic_cte 
      ) AS duplicate_HostNameMAC_filtered 
      ON duplicate_HostNameMAC_filtered.[UUID] = dsm_hardware_basic_cte.[UUID] 
      AND duplicate_HostNameMAC_filtered.[HostNameMACRowNum] = 1 
      INNER JOIN 
      (
      SELECT [UUID] 
        ,ROW_NUMBER() OVER (PARTITION BY [HostName], [IPAddress] ORDER BY [LastAgentExecution] DESC) AS [HostNameIPAddressRowNum] 
      FROM  dsm_hardware_basic_cte 
      ) AS duplicate_HostNameIPAddress_filtered 
      ON duplicate_HostNameIPAddress_filtered.[UUID] = dsm_hardware_basic_cte.[UUID] 
      AND duplicate_HostNameIPAddress_filtered.[HostNameIPAddressRowNum] = 1 
      INNER JOIN 
      (
      SELECT [UUID] 
        ,ROW_NUMBER() OVER (PARTITION BY [Name], [MaxUserUsername] ORDER BY [LastAgentExecution] DESC) AS [NameMaxUserRowNum] 
      FROM  dsm_hardware_basic_cte 
      ) AS duplicate_NameMaxUser_filtered 
      ON duplicate_NameMaxUser_filtered.[UUID] = dsm_hardware_basic_cte.[UUID] 
      AND duplicate_NameMaxUser_filtered.[NameMaxUserRowNum] = 1 
      INNER JOIN 
      (
      SELECT [UUID] 
        ,ROW_NUMBER() OVER (PARTITION BY [Name], [LastUserUsername] ORDER BY [LastAgentExecution] DESC) AS [NameLastUserRowNum] 
      FROM  dsm_hardware_basic_cte 
      ) AS duplicate_NameLastUser_filtered 
      ON duplicate_NameLastUser_filtered.[UUID] = dsm_hardware_basic_cte.[UUID] 
      AND duplicate_NameLastUser_filtered.[NameLastUserRowNum] = 1 
+1

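Since you are using SQL Server, the first step is to look at the actual query plan in SSMS. Does it "suggest" any indexes? Where does the query plan show most of the time being spent? – 2012-12-04 04:24:24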
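
Alongside the graphical plan, a quick way to see where the time goes is to turn on the session statistics before running the query. This is a minimal sketch, assuming the query is run from an SSMS query window; the `SELECT COUNT(*)` is only a placeholder for the query in the question.

```sql
-- Report CPU/elapsed time and per-table logical reads in the Messages tab.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- Placeholder: run the query under test here instead.
SELECT COUNT(*) FROM dbo.dsm_hardware_basic;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Comparing these numbers before and after a change gives a more repeatable measure than wall-clock time alone.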

+0

How often does this query run? –

+0

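I don't really understand query plans; I have been teaching myself as I go. I did capture an execution plan earlier today for a run of the query that I think took about 13 or 14 seconds. You can find it here: http://www.mediafire.com/file/lvfs7tg2iwnp2a7/execution_plan.sqlplan – user1367200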

Answers

0

I don't know exactly what your requirements are, but I would try rewriting the query like this:

WITH dsm_hardware_usernames_cte AS (
-- Derive the usernames first: a column alias cannot be referenced
-- by another expression in the same SELECT list.
SELECT 
    d.[UUID] 
    ,d.[Name] 
    ,d.[LastAgentExecution] 
    ,d.[MaxUserRegistration] 
    ,REPLACE(RIGHT(d.[MaxUserRegistration], CHARINDEX('/', REVERSE(d.[MaxUserRegistration])) - 1),'_ADMIN','') AS [MaxUserUsername] 
    ,d.[LastUserRegistration] 
    ,REPLACE(RIGHT(d.[LastUserRegistration], CHARINDEX('/', REVERSE(d.[LastUserRegistration])) - 1),'_ADMIN','') AS [LastUserUsername] 
    ,d.[IPAddress] 
    ,d.[HostName] 
    ,d.[MACAddress] 
FROM  dbo.dsm_hardware_basic AS d 
),
dsm_hardware_basic_cte AS (
-- Rank each row once per duplicate criterion; the newest row in each group gets 1.
SELECT 
    u.* 
    ,ROW_NUMBER() OVER (PARTITION BY u.[Name], u.[MACAddress] ORDER BY u.[LastAgentExecution] DESC) AS [NameMACRowNum] 
    ,ROW_NUMBER() OVER (PARTITION BY u.[Name], u.[HostName] ORDER BY u.[LastAgentExecution] DESC) AS [NameHostNameRowNum] 
    ,ROW_NUMBER() OVER (PARTITION BY u.[HostName], u.[MACAddress] ORDER BY u.[LastAgentExecution] DESC) AS [HostNameMACRowNum] 
    ,ROW_NUMBER() OVER (PARTITION BY u.[HostName], u.[IPAddress] ORDER BY u.[LastAgentExecution] DESC) AS [HostNameIPAddressRowNum] 
    ,ROW_NUMBER() OVER (PARTITION BY u.[Name], u.[MaxUserUsername] ORDER BY u.[LastAgentExecution] DESC) AS [NameMaxUserRowNum] 
    ,ROW_NUMBER() OVER (PARTITION BY u.[Name], u.[LastUserUsername] ORDER BY u.[LastAgentExecution] DESC) AS [NameLastUserRowNum] 
FROM  dsm_hardware_usernames_cte AS u 
) 

SELECT 
    c.[UUID] 
    ,c.[Name] 
    ,c.[LastAgentExecution] 
    ,c.[MaxUserRegistration] 
    ,c.[LastUserRegistration] 
    ,c.[IPAddress] 
    ,c.[HostName] 
    ,c.[MACAddress] 
FROM  dsm_hardware_basic_cte as c 
WHERE 
    -- All six criteria must hold, matching the chained INNER JOINs in the question.
    c.[NameMACRowNum] = 1 
    AND c.[NameHostNameRowNum] = 1 
    AND c.[HostNameMACRowNum] = 1 
    AND c.[HostNameIPAddressRowNum] = 1 
    AND c.[NameMaxUserRowNum] = 1 
    AND c.[NameLastUserRowNum] = 1 

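I believe your query and mine are logically equivalent. The optimizer may be smart enough to reduce your query to mine, but give it a spin and see! Two things to note:

  1. I used table aliases to make the query a bit more readable (in my opinion).
  2. I removed the `TOP 100 PERCENT` clause from your selects. It isn't needed; it is usually a hack people use to try to get an "ordered view". Don't do that. :)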
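
To illustrate the second point: a view does not guarantee row order, so the `ORDER BY` belongs in the query that reads from the view. This is a small sketch, assuming the deduplicating query has been saved as a view; `dbo.dsm_hardware_deduped` is a hypothetical name, not one from the question.

```sql
-- dbo.dsm_hardware_deduped is a hypothetical view name.
-- Sort in the outer query rather than inside the view definition.
SELECT [UUID], [Name], [HostName], [LastAgentExecution]
FROM   dbo.dsm_hardware_deduped
ORDER BY [LastAgentExecution] DESC;
```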
+0

With a few modifications to the code to fit how my full query works, I was able to get this working, and it is a bit faster. Thanks! – user1367200
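
One way to check that a rewrite returns the same rows as the original is to compare the two result sets with `EXCEPT`, which is available in SQL Server 2005. This is a sketch under the assumption that both versions are saved as views; `dbo.dsm_hardware_deduped_old` and `dbo.dsm_hardware_deduped_new` are hypothetical names.

```sql
-- Both view names are hypothetical.
-- UUIDs returned by the original query but not by the rewrite.
SELECT [UUID] FROM dbo.dsm_hardware_deduped_old
EXCEPT
SELECT [UUID] FROM dbo.dsm_hardware_deduped_new;

-- UUIDs returned by the rewrite but not by the original query.
SELECT [UUID] FROM dbo.dsm_hardware_deduped_new
EXCEPT
SELECT [UUID] FROM dbo.dsm_hardware_deduped_old;
```

If both statements return no rows, the two views return the same set of UUIDs.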

0

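According to your query plan, sorting by LastAgentExecution takes 19% of the time. Start by creating an index on that column. That said, if I were you, I would get out of the habit of using the `ROW_NUMBER() OVER (PARTITION BY [Name], [MACAddress] ORDER BY [LastAgentExecution] DESC)` style of syntax, because it does not appear to be very efficient.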
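
As a rough sketch of what creating such an index could look like: the index names below are made up, and the statements assume the columns are of types and sizes that SQL Server allows in an index key. Whether either index actually removes a sort should be confirmed against the actual execution plan.

```sql
-- Index names are hypothetical.
-- Single-column index on the ORDER BY column, as suggested above.
CREATE NONCLUSTERED INDEX IX_dsm_hardware_basic_LastAgentExecution
    ON dbo.dsm_hardware_basic ([LastAgentExecution] DESC);

-- Variant keyed on one window's PARTITION BY columns followed by its
-- ORDER BY column, so rows arrive already grouped and ordered for that window.
CREATE NONCLUSTERED INDEX IX_dsm_hardware_basic_Name_MAC_LastExec
    ON dbo.dsm_hardware_basic ([Name], [MACAddress], [LastAgentExecution] DESC)
    INCLUDE ([UUID]);
```

The second form only matches the `[Name], [MACAddress]` window; the other five partitions in the query would each need their own index to benefit in the same way, and every extra index adds cost to writes on the table.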

+0

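How do I create an index on a column? I have also seen people mention indexing columns in other questions and answers, and I was wondering if you could give a simple explanation of why indexing a column makes a query run faster. – user1367200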

+0

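Also, I don't know what I could use in a view other than the ROW_NUMBER function. I wrote a solution with temp tables, but later learned that I cannot use temp tables in a view. – user1367200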

+0

Actually, I found a good explanation of indexes here: http://odetocode.com/articles/237.aspx – user1367200

0

Instead of the inner joins, try replacing them with EXISTS clauses, like this:

WHERE  EXISTS   
      (SELECT [UUID],[NameMACRowNum] 
       FROM 
         (SELECT [UUID] 
           ,ROW_NUMBER() OVER (PARTITION BY [Name], [MACAddress] ORDER BY [LastAgentExecution] DESC) AS [NameMACRowNum] 
         FROM  dsm_hardware_basic_cte) AS duplicate_NameMAC_filtered 
       WHERE duplicate_NameMAC_filtered.[UUID] = dsm_hardware_basic_cte.[UUID] 
       AND duplicate_NameMAC_filtered.[NameMACRowNum] = 1) 

I'm not sure whether it should be EXISTS or NOT EXISTS, but that will be simple to sort out once the rest of the change is working.