2016-04-22 44 views
 SELECT 
s.ColID1 
,s.ColIdentification2 
,s.StatusColumn 
,(SELECT 
    MAX(pd.DateColumn) 
    FROM DocumentTable pd 
    WHERE pd.IsPresent = 1 
    AND pd.ColIdentification2 = s.ColIdentification2 
    AND pd.TypeofFile = 'TextFiles') 
AS maxDate 
,(SELECT TOP 1 
    u.Title 
    FROM DocumentTable pd 
    LEFT OUTER JOIN [User] u 
    ON u.UserId = pd.UserId 
    WHERE pd.IsPresent = 1 
    AND pd.ColIdentification2 = s.ColIdentification2 
    AND pd.TypeofFile = 'Text Files' 
    ORDER BY pd.DateColumn DESC) 
AS Name1 
,(SELECT TOP 1 
    pd.DocumentType 
    FROM DocumentTable pd 
    WHERE pd.IsPresent = 1 
    AND pd.ColIdentification2 = s.ColIdentification2 
    AND pd.TypeofFile = 'Text Files' 
    ORDER BY pd.DateColumn DESC) 
, (SELECT TOP 1 
    pd.TypeofFile 
    FROM DocumentTable pd 
    WHERE pd.IsPresent = 1 
    AND pd.ColIdentification2 = s.ColIdentification2 
    AND pd.TypeofFile = 'Text Files' 
    ORDER BY pd.DateColumn DESC) 
,(SELECT TOP 1 
    pd.Region 
    FROM DocumentTable pd 
    WHERE pd.IsPresent = 1 
    AND pd.ColIdentification2 = s.ColIdentification2 
    AND pd.TypeofFile = 'Text Files' 
    ORDER BY pd.DateColumn DESC) 
,(SELECT TOP 1 
    pd.Agency 
    FROM DocumentTable pd 
    WHERE pd.IsPresent = 1 
    AND pd.ColIdentification2 = s.ColIdentification2 
    AND pd.TypeofFile = 'Text Files' 
    ORDER BY pd.DateColumn DESC) 
FROM Service s (NOLOCK) 
--left outer join DocumentTable pd1 (NOLOCK) 
--on pd1.ColIdentification2 = s.ColIdentification2 
WHERE s.IsPresent = 1 
--AND pd1.ColIdentification2 = s.ColIdentification2 
AND s.StatusColumn IN ('Val1', 'Val3') 
AND NOT EXISTS (SELECT 
    pd.DocumentTableId 
FROM DocumentTable pd 
WHERE pd.IsPresent = 1 
AND pd.ColIdentification2 = s.ColIdentification2 
AND pd.TypeofFile IN ('DC1', 'DC2')) 
AND NOT EXISTS (SELECT 
    utds.ID 
FROM utds 
WHERE utds.Service_x0020_ID1_Id = s.ColID1 
AND utds.Type IN ('DC1', 'DC2')) 
ORDER BY s.ColID1 

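I want to optimize this SQL. Because of the many subqueries it takes a long time: the query runs for more than 10 minutes and I am trying to improve it. Is there any way to avoid the subqueries? I tried using a left outer join between the tables, but I think I did not get the correct data because of duplicate ColID1 values in the document table.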

Answers

It is hard to tune a query without statistics, the execution plan, and some trial and error.
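
As a starting point for gathering those numbers, SQL Server can report I/O and timing per statement. A minimal sketch; the SELECT here is only a stand-in for the slow query from the question:

```sql
-- Report logical reads per table and CPU/elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Stand-in query: replace with the full query from the question.
SELECT COUNT(*)
FROM Service s
WHERE s.IsPresent = 1
  AND s.StatusColumn IN ('Val1', 'Val3');

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Comparing the logical reads on DocumentTable before and after a rewrite gives a steadier signal than wall-clock time alone.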

I think you can do better by converting the subqueries into joins, so try to eliminate the subqueries.

You can use the query below:

SELECT s.ColID1 
    , s.ColIdentification2 
    , s.StatusColumn 
    , pd.DocumentType, pd.TypeofFile, pd.Region, pd.Agency 
from [Service] s 
    outer apply (select top 1 DocumentType, TypeofFile, Region, Agency 
       from DocumentTable 
       where IsPresent = 1 and TypeofFile = 'Text Files' 
        and ColIdentification2 = s.ColIdentification2 
       order by DateColumn desc) pd 

If that helps, try using the same approach to get rid of the other four subqueries.
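
Carried through to the whole query from the question, the same idea might look like the sketch below. It is untested and rests on a few assumptions that are not in the original: that maxDate was meant to use 'Text Files' like the other columns, that UserId is unique in [User] (otherwise the join could multiply rows), and that dropping the NOLOCK hint is acceptable.

```sql
-- Sketch only: one OUTER APPLY fetches the newest 'Text Files' row per Service row.
SELECT s.ColID1
     , s.ColIdentification2
     , s.StatusColumn
     , pd.DateColumn AS maxDate      -- assumes 'Text Files' was intended here too
     , u.Title       AS Name1
     , pd.DocumentType
     , pd.TypeofFile
     , pd.Region
     , pd.Agency
FROM Service s
OUTER APPLY (SELECT TOP 1
                    d.UserId, d.DateColumn, d.DocumentType
                  , d.TypeofFile, d.Region, d.Agency
             FROM DocumentTable d
             WHERE d.IsPresent = 1
               AND d.ColIdentification2 = s.ColIdentification2
               AND d.TypeofFile = 'Text Files'
             ORDER BY d.DateColumn DESC) pd
LEFT OUTER JOIN [User] u             -- assumes UserId is unique in [User]
       ON u.UserId = pd.UserId
WHERE s.IsPresent = 1
  AND s.StatusColumn IN ('Val1', 'Val3')
  AND NOT EXISTS (SELECT 1
                  FROM DocumentTable x
                  WHERE x.IsPresent = 1
                    AND x.ColIdentification2 = s.ColIdentification2
                    AND x.TypeofFile IN ('DC1', 'DC2'))
  AND NOT EXISTS (SELECT 1
                  FROM utds
                  WHERE utds.Service_x0020_ID1_Id = s.ColID1
                    AND utds.Type IN ('DC1', 'DC2'))
ORDER BY s.ColID1;
```

Because TOP 1 returns at most one row per Service row, this form cannot duplicate Service rows the way a plain left outer join on DocumentTable does.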

Also, make sure you have an index on the ColIdentification2 column in both tables.
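
For DocumentTable, one possible shape for such an index is sketched below. The index name is made up, the key order is only a guess based on the predicates in the question, and it assumes all listed columns have types that are allowed in an index; whether it actually helps depends on the data and the existing indexes.

```sql
-- Hypothetical index: name and column choice are assumptions, not from the question.
-- Key columns follow the equality predicates, then the ORDER BY column.
CREATE NONCLUSTERED INDEX IX_DocumentTable_ColIdentification2_TypeofFile_DateColumn
    ON DocumentTable (ColIdentification2, TypeofFile, DateColumn DESC)
    INCLUDE (IsPresent, UserId, DocumentType, Region, Agency);
```

With the lookup columns included, the "newest row per ColIdentification2" lookup can be answered from the index without touching the base table.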

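Thanks. We have the ColIdentification2 column defined as the primary key with a clustered index on it. I will try creating a nonclustered index on this column in the document table. I agree with you, and I will work from the statistics. The result set was different the last time I tried the left outer join, but I will try your suggestion. – Nate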


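The previous answer makes a very good point about making sure your common columns (such as ColIdentification2) are indexed. I would also verify that you have an index on DocumentTable.DateColumn.

Anyway...

Your query is a little busy, so let's reformat it a bit and take a "big picture" look at it: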

SELECT 
s.ColID1 
,s.ColIdentification2 
,s.StatusColumn 
,(SELECT TOP 1 u.Title   FROM DocumentTable pd LEFT OUTER JOIN [User] u ON u.UserId = pd.UserId WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'Text Files' ORDER BY pd.DateColumn DESC) AS Name1 
,(SELECT MAX(pd.DateColumn) FROM DocumentTable pd WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'TextFiles') AS maxDate 
,(SELECT TOP 1 pd.DocumentType FROM DocumentTable pd WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'Text Files' ORDER BY pd.DateColumn DESC) 
,(SELECT TOP 1 pd.TypeofFile FROM DocumentTable pd WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'Text Files' ORDER BY pd.DateColumn DESC) 
,(SELECT TOP 1 pd.Region  FROM DocumentTable pd WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'Text Files' ORDER BY pd.DateColumn DESC) 
,(SELECT TOP 1 pd.Agency  FROM DocumentTable pd WHERE pd.IsPresent = 1 AND pd.ColIdentification2 = s.ColIdentification2 AND pd.TypeofFile = 'Text Files' ORDER BY pd.DateColumn DESC) 
FROM Service s (NOLOCK) 
WHERE s.IsPresent = 1 
    AND s.StatusColumn IN ('Val1', 'Val3') 
AND NOT EXISTS (SELECT utds.ID FROM utds WHERE utds.Service_x0020_ID1_Id = s.ColID1 AND utds.Type IN ('DC1', 'DC2')) 
ORDER BY s.ColID1 

So the following columns look like they will all end up coming from the same row of DocumentTable pd:

pd.DateColumn 
pd.DocumentType 
pd.TypeofFile 
pd.Region  
pd.Agency  

note: For pd.DateColumn, your use of max(pd.DateColumn) gives the same result as 
     the sub-select style you're using in the other pd.* columns: 
     SELECT TOP 1 pd.DateColumn from ...BLAH BLAH BLAH... order by pd.DateColumn DESC 
Also, your pd.DateColumn sub-select has a where clause checking for 'TextFiles' 
instead of the 'Text Files' that the other pd.* columns are using. Should they all 
be 'Text Files'? (Note the embedded space in 'Text Files' vs 'TextFiles'.) 
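
A quick way to settle the 'TextFiles' question is to look at which spellings actually occur in the data. A small sketch, using only the table and column names from the question:

```sql
-- Count rows per spelling; a spelling that never occurs produces no row at all.
SELECT pd.TypeofFile
     , COUNT(*) AS RowCnt
FROM DocumentTable pd
WHERE pd.TypeofFile IN ('TextFiles', 'Text Files')
GROUP BY pd.TypeofFile;
```

If 'TextFiles' does not show up in the result, the maxDate sub-select in the original query can only ever return NULL.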

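Instead of running the same subquery logic against pd 5 different times, let's push it into a left join and try to do it once...

This is completely untested code, by the way. I hope it works :-)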

SELECT 
    s.ColID1 
, s.ColIdentification2 
, s.StatusColumn 
/* If we get a stable row for PD pulling u.Title from User becomes easier... */ 
, (select u.Title from [User] u where u.UserId = pd.UserId) as userTitle 
, pd.DateColumn 
, pd.DocumentType 
, pd.TypeofFile 
, pd.Region 
, pd.Agency 
FROM Service s (NOLOCK) 
left join DocumentTable pd 
     on pd.IsPresent = 1 
     and pd.ColIdentification2 = s.ColIdentification2 
     and pd.TypeofFile = 'Text Files' 
     /* This next condition avoids having to do the ORDER BY pd.DateColumn DESC 
     * The idea is for sqlserver to consider all potential matching pd records 
     * but ignore any that aren't the largest date. 
     */ 
     and not exists(select 1 from DocumentTable pd2 
         where pd2.IsPresent   = pd.IsPresent 
         and pd2.ColIdentification2 = pd.ColIdentification2 
         and pd2.TypeofFile   = pd.TypeofFile 
         and pd2.DateColumn   > pd.DateColumn) 
     /* may as well add the "no DC1 & DC2" clause here... 
     * (in the ON clause this only NULLs the pd columns; the Service row is still returned) 
     */ 
     and not exists (select 1 FROM DocumentTable pd3 
         where pd3.IsPresent   = pd.IsPresent 
         and pd3.ColIdentification2 = pd.ColIdentification2 
         and pd3.TypeofFile   in ('DC1', 'DC2')) 
WHERE s.IsPresent = 1 
    AND s.StatusColumn IN ('Val1', 'Val3') 
    AND NOT EXISTS (
    SELECT 1 FROM utds 
    WHERE utds.Service_x0020_ID1_Id = s.ColID1 
     AND utds.Type     IN ('DC1', 'DC2')) 
ORDER BY s.ColID1 

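A few closing thoughts:

I like to indent complex WHERE clauses; it makes it easier for me to wrap my head around the logic.

To reason about how the query behaves, think about what the main table "s" is doing: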

select * FROM Service s 

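For each record we get from 's', we want to find (at most) one suitable 'pd' record.

"Suitable" here means matching on the common columns, such as pd.ColIdentification2 = s.ColIdentification2.

The subtle part is this bit here: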
AND NOT EXISTS (SELECT 1 FROM DocumentTable PD2 ....WHERE PD2.DATECOLUMN > PD.DATECOLUMN). 

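One speed benefit is that we don't really care about the ORDER BY; we just want to make sure we are on the most recent row in PD (we use NOT EXISTS with pd2 to kick any older pd records out of the running).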
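
One thing worth checking with this approach: the NOT EXISTS only removes rows with a strictly older date. If two 'Text Files' rows for the same ColIdentification2 share the newest DateColumn, both survive the join and the Service row comes back twice, which TOP 1 would not do. A sketch of a check for such ties:

```sql
-- Lists every (ColIdentification2, DateColumn) pair that occurs more than once
-- among the rows the left join considers.
SELECT pd.ColIdentification2
     , pd.DateColumn
     , COUNT(*) AS RowCnt
FROM DocumentTable pd
WHERE pd.IsPresent = 1
  AND pd.TypeofFile = 'Text Files'
GROUP BY pd.ColIdentification2, pd.DateColumn
HAVING COUNT(*) > 1;
```

An empty result does not rule out ties in future data, so a tie-breaker column may still be worth adding if duplicates would be a problem.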

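The reason I think this might be faster than the ORDER BY is that the SQL Server engine does not need to walk an index to handle the "ORDER BY DATECOLUMN DESC ... TOP 1". (A clever optimizer might figure this out and just jump to the largest entry in the index on DATECOLUMN, but that is a big maybe, so I expect this approach to be faster overall.)

You will notice a similar trick that blocks any PD record that has DC1 or DC2.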

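In the original query, I read that part (at the end, in the main WHERE clause) as meaning: "even if a given PD record is perfect in every way, if any PD/S match exists with 'DC1' or 'DC2' (regardless of date), then we want to drop all PD/S records."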
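
For comparison, here is a sketch of that reading with the check left in the main WHERE clause, as in the original query. Placed there it removes the Service row itself, whereas the same NOT EXISTS inside a LEFT JOIN's ON clause only leaves the pd columns NULL.

```sql
-- Service rows that have no 'DC1'/'DC2' document at all, regardless of date.
SELECT s.ColID1
     , s.ColIdentification2
     , s.StatusColumn
FROM Service s
WHERE s.IsPresent = 1
  AND s.StatusColumn IN ('Val1', 'Val3')
  AND NOT EXISTS (SELECT 1
                  FROM DocumentTable pd3
                  WHERE pd3.IsPresent = 1
                    AND pd3.ColIdentification2 = s.ColIdentification2
                    AND pd3.TypeofFile IN ('DC1', 'DC2'))
ORDER BY s.ColID1;
```

Which of the two behaviours is wanted is a question for the original poster; the two placements are not interchangeable.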