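SQL Index Spool (Eager Spool): speeding up a query

I have a stored procedure that I wrote a while back to help generate XML files used to share data with an external resource. Basically, end users dump data into a table named DataSharing, and when we execute the query it returns an XML document containing the required fields specified in DataSharing. The procedure works well, but it is very slow. When I run it through SSMS with 'Include Actual Execution Plan' turned on, 94% of the query cost is spent on an Index Spool (Eager Spool). After some research, it looks like I should rework the query so it performs better.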

Because I never know in advance what the data columns will be, I have to do a dynamic pivot to generate my data.

Here is the procedure:

CREATE PROCEDURE [dbo].[sp_HPSDDataSharing] 
    -- Add the parameters for the stored procedure here 
    @fileName varchar(MAX), @StartDate datetime, @EndDate datetime 
AS 
BEGIN 
    -- SET NOCOUNT ON added to prevent extra result sets from 
    -- interfering with SELECT statements. 
    SET NOCOUNT ON; 

    DECLARE @sqlCommand varchar(MAX), @listStr VARCHAR(MAX) 
    SELECT @listStr = 
     COALESCE(@listStr +',' ,'') + '[' + [ColumnName] + ']' 
    FROM [FCPP_HPSD].[dbo].[DataSharing] 
    WHERE FileName = @fileName 
    DECLARE @Result XML 
    SET @sqlCommand = 'Select * From (SELECT 
     [DatapointDate] 
     ,dp.ColumnName 
     ,[DataPointValue] 
    FROM [FCPP_HPSD].[dbo].[vw_DataCollection] DC 
    JOIN [FCPP_HPSD].[dbo].[Datasharing] dp 
    ON DC.DataPointID = DP.DatapointID 
    WHERE 
    [DatapointDate] >= ''' + CONVERT(varchar(MAX), @StartDate) + ''' 
    and [DatapointDate] < ''' + CONVERT(varchar(MAX), @EndDate) + ''' 
    and dc.DataPointID in (SELECT [DatapointID] FROM [FCPP_HPSD].[dbo].[DataSharing] Where FileName = ''' + @fileName + ''') 
) AS source 
    PIVOT 
    (
     SUM(DataPointValue) 
     FOR ColumnName IN ('+ @listStr +') 
    ) as pvt 
    ORDER BY DatapointDate 
    FOR XML Path(''' + 'DataRow' + '''), ROOT;' 

    Print @sqlCommand 

    EXEC (@sqlCommand) 

END 



GO 

The fully expanded query looks like this:

SELECT * 
FROM (SELECT [datapointdate], 
       dp.columnname, 
       [datapointvalue] 
     FROM [FCPP_HPSD].[dbo].[vw_datacollection] DC 
       JOIN [FCPP_HPSD].[dbo].[datasharing] dp 
       ON DC.datapointid = DP.datapointid 
     WHERE [datapointdate] >= 'Jul 15 2013 12:00AM' 
       AND [datapointdate] < 'Jul 22 2013 12:00AM' 
       AND dc.datapointid IN (SELECT [datapointid] 
             FROM [FCPP_HPSD].[dbo].[datasharing] 
             WHERE filename = 'fdrD3')) AS source 
     PIVOT (Sum(datapointvalue) 
      FOR columnname IN ([fdrD3_kWh_A], 
           [fdrD3_kWh_B], 
           [fdrD3_kWh_C], 
           [fdrD3_kWh], 
           [fdrD3_I_A], 
           [fdrD3_I_B], 
           [fdrD3_I_C], 
           [fdrD3_I_N], 
           [fdrD3_V_A], 
           [fdrD3_V_B], 
           [fdrD3_V_C], 
           [fdrD3_V_A-B], 
           [fdrD3_V_B-C], 
           [fdrD3_kV_C-A], 
           [fdrD3_kW], 
           [fdrD3_kVA], 
           [fdrD3_kVAr], 
           [fdrD3_kW_A], 
           [fdrD3_kW_B], 
           [fdrD3_kW_C], 
           [fdrD3_kVA_A], 
           [fdrD3_kVA_B], 
           [fdrD3_kVA_C], 
           [fdrD3_kVAr_A], 
           [fdrD3_kVAr_B], 
           [fdrD3_kVAr_C], 
           [fdrD3_F], 
           [fdrD3_Iang_A], 
           [fdrD3_Iang_B], 
           [fdrD3_Iang_C], 
           [fdrD3_Iang_N], 
           [fdrD3_Vang_A], 
           [fdrD3_Vang_B], 
           [fdrD3_Vang_C], 
           [fdrD3_Vang_A-B], 
           [fdrD3_Vang_B-C], 
           [fdrD3_Vang_C-A], 
           [fdrD3_PF_A], 
           [fdrD3_PF_B], 
           [fdrD3_PF_C], 
           [fdrD3_PF], 
           [fdrD3_Pst_V_A], 
           [fdrD3_Pst_V_B], 
           [fdrD3_Pst_V_C], 
           [fdrD3_Plt_V_A], 
           [fdrD3_Plt_V_B], 
           [fdrD3_Plt_V_C], 
           [fdrD3_Vdev_A], 
           [fdrD3_Vdev_B], 
           [fdrD3_Vdev_C], 
           [fdrD3_Fdev], 
           [fdrD3_THD_I_A], 
           [fdrD3_THD_I_B], 
           [fdrD3_THD_I_C], 
           [fdrD3_THD_I_N], 
           [fdrD3_THD_V_A], 
           [fdrD3_THD_V_B], 
           [fdrD3_THD_V_C])) AS pvt 
ORDER BY datapointdate 
FOR xml path('DataRow'), root; 

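So the procedure currently takes 35-65 seconds to run. Since I am hitting timeouts, I really need to speed it up. If anyone can help me figure out what I can do to make it faster and to stop so much time going into the Index Spool (Eager Spool), I would greatly appreciate it.

Edit 1:

I added a SQL Fiddle, so hopefully that helps.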

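Well, one small thing would be to not convert the start and end dates to strings and back to dates. – Hogan

Without a good indication of what the tables look like, what indexes they have (or lack), what volumes they hold, whether there are foreign key relationships, and so on, there is very little anyone can suggest. If possible, go to SQLFiddle, make sure to build all the relevant tables and their indexes, and then copy the code there, so that there is at least a chance you will get a useful answer. – deroby

@Hogan I had thought about that. But when I run the query without the PIVOT, it only takes about 6 seconds to execute, which is very good since we have 547k rows of data. –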
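
Hogan's comment about the date conversion can be sketched roughly as follows. This is only an illustration under assumptions: it is meant to replace the body of the procedure from the question (so @fileName, @StartDate, and @EndDate are the procedure's parameters), the variables are switched to nvarchar because sp_executesql expects Unicode text, and QUOTENAME is swapped in for the hand-written brackets. It has not been run against the real schema.

```sql
-- Sketch only: assumes it runs inside the procedure, where @fileName,
-- @StartDate and @EndDate are the procedure's parameters.
DECLARE @sqlCommand nvarchar(MAX), @listStr nvarchar(MAX);

-- Build the pivot column list; QUOTENAME adds the brackets and escapes any ']'.
SELECT @listStr = COALESCE(@listStr + N',', N'') + QUOTENAME([ColumnName])
FROM [FCPP_HPSD].[dbo].[DataSharing]
WHERE FileName = @fileName;

-- Only the column list is concatenated; the dates and file name stay parameters,
-- so nothing is converted to a string and back.
SET @sqlCommand = N'
SELECT *
FROM (SELECT [DatapointDate], dp.ColumnName, [DataPointValue]
      FROM [FCPP_HPSD].[dbo].[vw_DataCollection] DC
      JOIN [FCPP_HPSD].[dbo].[DataSharing] dp
        ON DC.DataPointID = dp.DatapointID
      WHERE [DatapointDate] >= @StartDate
        AND [DatapointDate] <  @EndDate
        AND DC.DataPointID IN (SELECT [DatapointID]
                               FROM [FCPP_HPSD].[dbo].[DataSharing]
                               WHERE FileName = @fileName)) AS source
PIVOT (SUM(DataPointValue) FOR ColumnName IN (' + @listStr + N')) AS pvt
ORDER BY DatapointDate
FOR XML PATH(''DataRow''), ROOT;';

EXEC sys.sp_executesql
     @sqlCommand,
     N'@StartDate datetime, @EndDate datetime, @fileName varchar(MAX)',
     @StartDate = @StartDate, @EndDate = @EndDate, @fileName = @fileName;
```

Whether this changes the run time is a separate question; the main effect is that the date comparison uses datetime values directly and the file name is no longer spliced into the statement text.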

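Answers

Here is your pivot expanded. See whether this runs faster (I bet it will, mostly because of the optimization of the CTE). If it does, you can rewrite your generator to create a query that looks like this: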

WITH datelist AS
(
    -- Filter once by date range and file name. datapointdate and datapointvalue
    -- are read through the join to the view, as in the question's query, and are
    -- pre-summed so each (date, column name) pair appears once.
    SELECT datapointdate, dp.columnname, SUM(datapointvalue) AS datapointvalue
    FROM [FCPP_HPSD].[dbo].[vw_datacollection] DC
    JOIN [FCPP_HPSD].[dbo].[datasharing] dp ON DC.datapointid = dp.datapointid
    WHERE datapointdate >= @StartDate AND datapointdate < @EndDate AND dp.filename = @filename
    GROUP BY datapointdate, dp.columnname
)
SELECT
    d.datapointdate,
    SUM(j1.datapointvalue) as sum_fdrD3_kWh_A,
    SUM(j2.datapointvalue) as sum_fdrD3_kWh_B,
    SUM(j3.datapointvalue) as sum_fdrD3_kWh_C,
    SUM(j4.datapointvalue) as sum_fdrD3_kWh,
    SUM(j5.datapointvalue) as sum_fdrD3_I_A,
    SUM(j6.datapointvalue) as sum_fdrD3_I_B,
    SUM(j7.datapointvalue) as sum_fdrD3_I_C,
    SUM(j8.datapointvalue) as sum_fdrD3_I_N,
    SUM(j9.datapointvalue) as sum_fdrD3_V_A,
    SUM(j10.datapointvalue) as sum_fdrD3_V_B,
    SUM(j12.datapointvalue) as sum_fdrD3_V_C,
    SUM(j13.datapointvalue) as sum_fdrD3_V_A_B,
    SUM(j14.datapointvalue) as sum_fdrD3_V_B_C,
    SUM(j15.datapointvalue) as sum_fdrD3_kV_C_A,
    SUM(j16.datapointvalue) as sum_fdrD3_kW,
    SUM(j17.datapointvalue) as sum_fdrD3_kVA,
    SUM(j18.datapointvalue) as sum_fdrD3_kVAr,
    SUM(j19.datapointvalue) as sum_fdrD3_kW_A,
    SUM(j20.datapointvalue) as sum_fdrD3_kW_B,
    SUM(j21.datapointvalue) as sum_fdrD3_kW_C,
    SUM(j22.datapointvalue) as sum_fdrD3_kVA_A,
    SUM(j23.datapointvalue) as sum_fdrD3_kVA_B,
    SUM(j24.datapointvalue) as sum_fdrD3_kVA_C,
    SUM(j25.datapointvalue) as sum_fdrD3_kVAr_A,
    SUM(j26.datapointvalue) as sum_fdrD3_kVAr_B,
    SUM(j27.datapointvalue) as sum_fdrD3_kVAr_C,
    SUM(j28.datapointvalue) as sum_fdrD3_F,
    SUM(j29.datapointvalue) as sum_fdrD3_Iang_A,
    SUM(j30.datapointvalue) as sum_fdrD3_Iang_B,
    SUM(j31.datapointvalue) as sum_fdrD3_Iang_C,
    SUM(j32.datapointvalue) as sum_fdrD3_Iang_N,
    SUM(j33.datapointvalue) as sum_fdrD3_Vang_A,
    SUM(j34.datapointvalue) as sum_fdrD3_Vang_B,
    SUM(j35.datapointvalue) as sum_fdrD3_Vang_C,
    SUM(j36.datapointvalue) as sum_fdrD3_Vang_A_B,
    SUM(j37.datapointvalue) as sum_fdrD3_Vang_B_C,
    SUM(j38.datapointvalue) as sum_fdrD3_Vang_C_A,
    SUM(j39.datapointvalue) as sum_fdrD3_PF_A,
    SUM(j40.datapointvalue) as sum_fdrD3_PF_B,
    SUM(j41.datapointvalue) as sum_fdrD3_PF_C,
    SUM(j42.datapointvalue) as sum_fdrD3_PF,
    SUM(j43.datapointvalue) as sum_fdrD3_Pst_V_A,
    SUM(j44.datapointvalue) as sum_fdrD3_Pst_V_B,
    SUM(j45.datapointvalue) as sum_fdrD3_Pst_V_C,
    SUM(j46.datapointvalue) as sum_fdrD3_Plt_V_A,
    SUM(j47.datapointvalue) as sum_fdrD3_Plt_V_B,
    SUM(j48.datapointvalue) as sum_fdrD3_Plt_V_C,
    SUM(j49.datapointvalue) as sum_fdrD3_Vdev_A,
    SUM(j50.datapointvalue) as sum_fdrD3_Vdev_B,
    SUM(j51.datapointvalue) as sum_fdrD3_Vdev_C,
    SUM(j52.datapointvalue) as sum_fdrD3_Fdev,
    SUM(j53.datapointvalue) as sum_fdrD3_THD_I_A,
    SUM(j54.datapointvalue) as sum_fdrD3_THD_I_B,
    SUM(j55.datapointvalue) as sum_fdrD3_THD_I_C,
    SUM(j56.datapointvalue) as sum_fdrD3_THD_I_N,
    SUM(j57.datapointvalue) as sum_fdrD3_THD_V_A,
    SUM(j58.datapointvalue) as sum_fdrD3_THD_V_B,
    SUM(j59.datapointvalue) as sum_fdrD3_THD_V_C
-- One output row per date; each LEFT JOIN picks up the value for one column name.
FROM (SELECT DISTINCT datapointdate FROM datelist) d
LEFT JOIN datelist j1 ON d.datapointdate = j1.datapointdate AND j1.columnname = 'fdrD3_kWh_A'
LEFT JOIN datelist j2 ON d.datapointdate = j2.datapointdate AND j2.columnname = 'fdrD3_kWh_B'
LEFT JOIN datelist j3 ON d.datapointdate = j3.datapointdate AND j3.columnname = 'fdrD3_kWh_C'
LEFT JOIN datelist j4 ON d.datapointdate = j4.datapointdate AND j4.columnname = 'fdrD3_kWh'
LEFT JOIN datelist j5 ON d.datapointdate = j5.datapointdate AND j5.columnname = 'fdrD3_I_A'
LEFT JOIN datelist j6 ON d.datapointdate = j6.datapointdate AND j6.columnname = 'fdrD3_I_B'
LEFT JOIN datelist j7 ON d.datapointdate = j7.datapointdate AND j7.columnname = 'fdrD3_I_C'
LEFT JOIN datelist j8 ON d.datapointdate = j8.datapointdate AND j8.columnname = 'fdrD3_I_N'
LEFT JOIN datelist j9 ON d.datapointdate = j9.datapointdate AND j9.columnname = 'fdrD3_V_A'
LEFT JOIN datelist j10 ON d.datapointdate = j10.datapointdate AND j10.columnname = 'fdrD3_V_B'
LEFT JOIN datelist j12 ON d.datapointdate = j12.datapointdate AND j12.columnname = 'fdrD3_V_C'
LEFT JOIN datelist j13 ON d.datapointdate = j13.datapointdate AND j13.columnname = 'fdrD3_V_A-B'
LEFT JOIN datelist j14 ON d.datapointdate = j14.datapointdate AND j14.columnname = 'fdrD3_V_B-C'
LEFT JOIN datelist j15 ON d.datapointdate = j15.datapointdate AND j15.columnname = 'fdrD3_kV_C-A'
LEFT JOIN datelist j16 ON d.datapointdate = j16.datapointdate AND j16.columnname = 'fdrD3_kW'
LEFT JOIN datelist j17 ON d.datapointdate = j17.datapointdate AND j17.columnname = 'fdrD3_kVA'
LEFT JOIN datelist j18 ON d.datapointdate = j18.datapointdate AND j18.columnname = 'fdrD3_kVAr'
LEFT JOIN datelist j19 ON d.datapointdate = j19.datapointdate AND j19.columnname = 'fdrD3_kW_A'
LEFT JOIN datelist j20 ON d.datapointdate = j20.datapointdate AND j20.columnname = 'fdrD3_kW_B'
LEFT JOIN datelist j21 ON d.datapointdate = j21.datapointdate AND j21.columnname = 'fdrD3_kW_C'
LEFT JOIN datelist j22 ON d.datapointdate = j22.datapointdate AND j22.columnname = 'fdrD3_kVA_A'
LEFT JOIN datelist j23 ON d.datapointdate = j23.datapointdate AND j23.columnname = 'fdrD3_kVA_B'
LEFT JOIN datelist j24 ON d.datapointdate = j24.datapointdate AND j24.columnname = 'fdrD3_kVA_C'
LEFT JOIN datelist j25 ON d.datapointdate = j25.datapointdate AND j25.columnname = 'fdrD3_kVAr_A'
LEFT JOIN datelist j26 ON d.datapointdate = j26.datapointdate AND j26.columnname = 'fdrD3_kVAr_B'
LEFT JOIN datelist j27 ON d.datapointdate = j27.datapointdate AND j27.columnname = 'fdrD3_kVAr_C'
LEFT JOIN datelist j28 ON d.datapointdate = j28.datapointdate AND j28.columnname = 'fdrD3_F'
LEFT JOIN datelist j29 ON d.datapointdate = j29.datapointdate AND j29.columnname = 'fdrD3_Iang_A'
LEFT JOIN datelist j30 ON d.datapointdate = j30.datapointdate AND j30.columnname = 'fdrD3_Iang_B'
LEFT JOIN datelist j31 ON d.datapointdate = j31.datapointdate AND j31.columnname = 'fdrD3_Iang_C'
LEFT JOIN datelist j32 ON d.datapointdate = j32.datapointdate AND j32.columnname = 'fdrD3_Iang_N'
LEFT JOIN datelist j33 ON d.datapointdate = j33.datapointdate AND j33.columnname = 'fdrD3_Vang_A'
LEFT JOIN datelist j34 ON d.datapointdate = j34.datapointdate AND j34.columnname = 'fdrD3_Vang_B'
LEFT JOIN datelist j35 ON d.datapointdate = j35.datapointdate AND j35.columnname = 'fdrD3_Vang_C'
LEFT JOIN datelist j36 ON d.datapointdate = j36.datapointdate AND j36.columnname = 'fdrD3_Vang_A-B'
LEFT JOIN datelist j37 ON d.datapointdate = j37.datapointdate AND j37.columnname = 'fdrD3_Vang_B-C'
LEFT JOIN datelist j38 ON d.datapointdate = j38.datapointdate AND j38.columnname = 'fdrD3_Vang_C-A'
LEFT JOIN datelist j39 ON d.datapointdate = j39.datapointdate AND j39.columnname = 'fdrD3_PF_A'
LEFT JOIN datelist j40 ON d.datapointdate = j40.datapointdate AND j40.columnname = 'fdrD3_PF_B'
LEFT JOIN datelist j41 ON d.datapointdate = j41.datapointdate AND j41.columnname = 'fdrD3_PF_C'
LEFT JOIN datelist j42 ON d.datapointdate = j42.datapointdate AND j42.columnname = 'fdrD3_PF'
LEFT JOIN datelist j43 ON d.datapointdate = j43.datapointdate AND j43.columnname = 'fdrD3_Pst_V_A'
LEFT JOIN datelist j44 ON d.datapointdate = j44.datapointdate AND j44.columnname = 'fdrD3_Pst_V_B'
LEFT JOIN datelist j45 ON d.datapointdate = j45.datapointdate AND j45.columnname = 'fdrD3_Pst_V_C'
LEFT JOIN datelist j46 ON d.datapointdate = j46.datapointdate AND j46.columnname = 'fdrD3_Plt_V_A'
LEFT JOIN datelist j47 ON d.datapointdate = j47.datapointdate AND j47.columnname = 'fdrD3_Plt_V_B'
LEFT JOIN datelist j48 ON d.datapointdate = j48.datapointdate AND j48.columnname = 'fdrD3_Plt_V_C'
LEFT JOIN datelist j49 ON d.datapointdate = j49.datapointdate AND j49.columnname = 'fdrD3_Vdev_A'
LEFT JOIN datelist j50 ON d.datapointdate = j50.datapointdate AND j50.columnname = 'fdrD3_Vdev_B'
LEFT JOIN datelist j51 ON d.datapointdate = j51.datapointdate AND j51.columnname = 'fdrD3_Vdev_C'
LEFT JOIN datelist j52 ON d.datapointdate = j52.datapointdate AND j52.columnname = 'fdrD3_Fdev'
LEFT JOIN datelist j53 ON d.datapointdate = j53.datapointdate AND j53.columnname = 'fdrD3_THD_I_A'
LEFT JOIN datelist j54 ON d.datapointdate = j54.datapointdate AND j54.columnname = 'fdrD3_THD_I_B'
LEFT JOIN datelist j55 ON d.datapointdate = j55.datapointdate AND j55.columnname = 'fdrD3_THD_I_C'
LEFT JOIN datelist j56 ON d.datapointdate = j56.datapointdate AND j56.columnname = 'fdrD3_THD_I_N'
LEFT JOIN datelist j57 ON d.datapointdate = j57.datapointdate AND j57.columnname = 'fdrD3_THD_V_A'
LEFT JOIN datelist j58 ON d.datapointdate = j58.datapointdate AND j58.columnname = 'fdrD3_THD_V_B'
LEFT JOIN datelist j59 ON d.datapointdate = j59.datapointdate AND j59.columnname = 'fdrD3_THD_V_C'
GROUP BY d.datapointdate
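
No, it won't, because the column names will always be different. –

@kevinc - see the new query. – Hogan

The problem with this is that the DataSharing table allows users to specify which columns –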

I have removed the subquery; I hope this speeds up execution and does not produce incorrect results.

SELECT * 
    FROM (SELECT [datapointdate], 
        dp.columnname, 
        [datapointvalue] 
      FROM [FCPP_HPSD].[dbo].[vw_datacollection] DC 
        JOIN [FCPP_HPSD].[dbo].[datasharing] dp 
        ON DC.datapointid = DP.datapointid 
      WHERE [datapointdate] >= 'Jul 15 2013 12:00AM' 
        AND [datapointdate] < 'Jul 22 2013 12:00AM' 
        AND dc.datapointid IN (SELECT [datapointid] 
            FROM [FCPP_HPSD].[dbo].[datasharing] 
            WHERE filename = 'fdrD3')) AS source 
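      PIVOT (Sum(datapointvalue) 
       -- Note: SQL Server's PIVOT does not accept a subquery in the IN list;
       -- it requires explicit column names, so this form will not run there.
       FOR columnname IN (select distinct dp.columnname 
          from [FCPP_HPSD].[dbo].[datasharing] dp)) AS pvt 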
    ORDER BY datapointdate 
    FOR xml path('DataRow'), root; 

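Edit: A WHERE clause may be needed there if you want to select specific data. It works in Oracle. I put the subquery back in place and added another query inside the PIVOT, just to simplify the code and to make sure any new data is picked up in the future.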

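This makes it run slower. –

That is hard to believe. I have always observed subqueries to be slower than joins and WHERE clauses. Please also consider any network-related issues. – Tauseef

Yes, I have asked our network engineer to check for network issues. –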

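Based on the SqlFiddle, I would suggest adding an extra index on DataSharing (FileName, DataPointID). However, from your comment it seems the actual query only takes 6 seconds (does that include the time needed to send all 547k records to SSMS?), so the rest of the time is absorbed by the PIVOT and the conversion to XML?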
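
For reference, the index suggestion could look something like the statement below. It is a sketch under assumptions: the index name is made up, the INCLUDE clause is an optional extra that is not part of the suggestion itself, and it only works if FileName has a type that can be used as an index key.

```sql
-- Sketch only: the index name is hypothetical, and FileName must be a
-- type that can serve as an index key (not varchar(MAX), for example).
CREATE NONCLUSTERED INDEX IX_DataSharing_FileName_DataPointID
    ON [FCPP_HPSD].[dbo].[DataSharing] (FileName, DataPointID)
    INCLUDE (ColumnName);  -- optional: lets the column-name lookup use this index alone
```

Comparing the actual execution plan before and after creating it would show whether the Index Spool is still being built.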

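  • Can you give a timing for how long it takes to do just the SELECT and pump the result into a temp table?
  • Can you give a timing for how long it takes to PIVOT from said temp table INTO another temp table?
  • Can you give a timing for how long the conversion from that last temp table to XML takes when the result is stored in a variable?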
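
The three measurements asked for above could be taken with something like the following. It is a rough sketch, not a tested script: the temp-table names are made up, the date literals and file name are copied from the question, and only three of the pivot columns are listed to keep it short.

```sql
-- Sketch only: hypothetical temp-table names; literals copied from the question.
DECLARE @t0 datetime2 = SYSDATETIME(), @t1 datetime2, @t2 datetime2, @t3 datetime2;

-- Stage 1: the plain SELECT, pumped into a temp table.
SELECT [DatapointDate], dp.ColumnName, [DataPointValue]
INTO #source
FROM [FCPP_HPSD].[dbo].[vw_DataCollection] DC
JOIN [FCPP_HPSD].[dbo].[DataSharing] dp
  ON DC.DataPointID = dp.DatapointID
WHERE [DatapointDate] >= '20130715'
  AND [DatapointDate] <  '20130722'
  AND DC.DataPointID IN (SELECT [DatapointID]
                         FROM [FCPP_HPSD].[dbo].[DataSharing]
                         WHERE FileName = 'fdrD3');
SET @t1 = SYSDATETIME();

-- Stage 2: PIVOT from the first temp table into a second one.
-- Only three of the columns are listed to keep the sketch short.
SELECT *
INTO #pivoted
FROM #source
PIVOT (SUM(DataPointValue)
       FOR ColumnName IN ([fdrD3_kWh_A], [fdrD3_kWh_B], [fdrD3_kWh_C])) AS pvt;
SET @t2 = SYSDATETIME();

-- Stage 3: convert the pivoted rows to XML, stored in a variable
-- so that no time is spent sending the result to the client.
DECLARE @Result xml =
    (SELECT * FROM #pivoted
     ORDER BY DatapointDate
     FOR XML PATH('DataRow'), ROOT, TYPE);
SET @t3 = SYSDATETIME();

-- Elapsed time per stage, in milliseconds.
SELECT DATEDIFF(millisecond, @t0, @t1) AS select_ms,
       DATEDIFF(millisecond, @t1, @t2) AS pivot_ms,
       DATEDIFF(millisecond, @t2, @t3) AS xml_ms;

DROP TABLE #source, #pivoted;
```

With the full column list in stage 2, the three numbers should show which step accounts for most of the 35-65 seconds.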

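Code-wise, I am not a fan of the subquery, but it seems safer to use a WHERE EXISTS() construction than a straight JOIN, imho. Then again, the optimizer is often aware of this and already does it for us, so the plan for the query below may well look exactly the same as the original one.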

Select * 
    From (SELECT [DatapointDate] 
       ,dp.ColumnName 
       ,[DataPointValue] 
      FROM [DataCollection] DC 
      JOIN [Datasharing] dp 
      ON DC.DataPointID = DP.DatapointID 
      WHERE [DatapointDate] >= 'Jul 15 2013 12:00AM' 
      AND [DatapointDate] < 'Jul 22 2013 12:00AM' 
      AND EXISTS (SELECT * 
          FROM [DataSharing] ds 
          WHERE ds.[FileName] = 'fdrD3' 
          AND dc.DataPointID = ds.[DatapointID]) 
     ) AS source 
PIVOT 
    (
     SUM(DataPointValue) 
     FOR ColumnName IN ([fdrD3_kWh_A],[fdrD3_kWh_B],[fdrD3_kWh_C],[fdrD3_kWh],[fdrD3_I_A],[fdrD3_I_B],[fdrD3_I_C],[fdrD3_I_N],[fdrD3_V_A],[fdrD3_V_B],[fdrD3_V_C],[fdrD3_V_A-B],[fdrD3_V_B-C],[fdrD3_kV_C-A],[fdrD3_kW],[fdrD3_kVA],[fdrD3_kVAr],[fdrD3_kW_A],[fdrD3_kW_B],[fdrD3_kW_C],[fdrD3_kVA_A],[fdrD3_kVA_B],[fdrD3_kVA_C],[fdrD3_kVAr_A],[fdrD3_kVAr_B],[fdrD3_kVAr_C],[fdrD3_F],[fdrD3_Iang_A],[fdrD3_Iang_B],[fdrD3_Iang_C],[fdrD3_Iang_N],[fdrD3_Vang_A],[fdrD3_Vang_B],[fdrD3_Vang_C],[fdrD3_Vang_A-B],[fdrD3_Vang_B-C],[fdrD3_Vang_C-A],[fdrD3_PF_A],[fdrD3_PF_B],[fdrD3_PF_C],[fdrD3_PF],[fdrD3_Pst_V_A],[fdrD3_Pst_V_B],[fdrD3_Pst_V_C],[fdrD3_Plt_V_A],[fdrD3_Plt_V_B],[fdrD3_Plt_V_C],[fdrD3_Vdev_A],[fdrD3_Vdev_B],[fdrD3_Vdev_C],[fdrD3_Fdev],[fdrD3_THD_I_A],[fdrD3_THD_I_B],[fdrD3_THD_I_C],[fdrD3_THD_I_N],[fdrD3_THD_V_A],[fdrD3_THD_V_B],[fdrD3_THD_V_C]) 
    ) as pvt 

ORDER BY DatapointDate 
FOR XML Path('DataRow'), ROOT 

One more question: do you really need the ORDER BY DatapointDate there?