2016-06-24 110 views
0

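Tuning a cross-record update query

I have a table with over 200 million records, and I am trying to run the query below. The query updates each row based on the timestamp of the previous record. Is there any way to make this query run faster?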

UPDATE [dbo].[Location Data] 
    SET [timestamp_prev] = 
    (
      SELECT [timestamp] FROM [dbo].[Location Data] newTable 
       WHERE [dbo].[Location Data].[RowNumber] = (newTable.[RowNumber] + 1) 
       AND [dbo].[Location Data].[mmsi] = newTable.[mmsi] 
    ); 
+0

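Check your query plan: does it actually execute a correlated subquery, or does it convert it into a self-join? If not, you should rewrite it as a join yourself. – Blorgbeard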
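One way to follow this suggestion without actually running the update against 200 million rows is to ask SQL Server for the estimated plan only. The sketch below assumes a client that understands the `GO` batch separator (for example SSMS or sqlcmd); `SET SHOWPLAN_XML` has to be the only statement in its batch.

```sql
-- While SHOWPLAN_XML is ON, statements are compiled but not executed;
-- the estimated plan is returned instead of the statement's results.
SET SHOWPLAN_XML ON;
GO

-- The query from the question, unchanged.
UPDATE [dbo].[Location Data]
    SET [timestamp_prev] =
    (
      SELECT [timestamp] FROM [dbo].[Location Data] newTable
       WHERE [dbo].[Location Data].[RowNumber] = (newTable.[RowNumber] + 1)
       AND [dbo].[Location Data].[mmsi] = newTable.[mmsi]
    );
GO

SET SHOWPLAN_XML OFF;
GO
```

In the returned plan, a Nested Loops operator that re-reads the table once per outer row would suggest the subquery is being evaluated row by row, whereas a Hash Match or Merge Join would suggest the optimizer has already turned it into a join.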

+0

Which version of SQL Server are you using? –

Answers

2

You could try using a self-join:

UPDATE 
    t1 
SET 
    t1.[timestamp_prev] = t2.[timestamp] 
FROM 
    [dbo].[Location Data] t1 
INNER JOIN 
    [dbo].[Location Data] t2 
    ON t1.[RowNumber] = t2.[RowNumber] + 1 AND 
     t1.[mmsi] = t2.[mmsi] 

If you have indexes on the join columns, this query might finish before you retire.

0

An inner join like the one below may help, instead of scanning all the rows of the table for each row, as the nested query does.

UPDATE oldTable 
SET oldTable.[timestamp_prev] = newTable.[timestamp] 
FROM [dbo].[Location Data] oldTable 
INNER JOIN [dbo].[Location Data] newTable 
    ON oldTable.[RowNumber] = newTable.[RowNumber] + 1 
       AND oldTable.[mmsi] = newTable.[mmsi] 
0

I would try something like this:

UPDATE T1 SET 
    [timestamp_prev] = T2.[timestamp] 
FROM [dbo].[Location Data] T1 
    INNER JOIN [dbo].[Location Data] T2 
     ON T1.RowNumber = T2.RowNumber + 1 
      AND T1.mmsi = T2.mmsi 
WHERE T1.[timestamp_prev] IS NULL; 

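The join should be more efficient, and it only attempts to update records that do not yet have a previous timestamp. As a further step, you can add an index on RowNumber, MMSI and Timestamp_Prev to the table, and then make sure you keep that index maintained to maximize efficiency.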

A simple index like this should be a good start:

CREATE NONCLUSTERED INDEX ix_Location_Data_MMSI_RowNumber_Timestamp_Prev 
    ON dbo.[Location Data] (mmsi, RowNumber, Timestamp_Prev) INCLUDE (Timestamp); 
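Because the update above only touches rows where `timestamp_prev` is still NULL, it can also be run in batches, so that each statement commits on its own and the job can be stopped and resumed. The following is a minimal sketch of that idea, not part of the original answer; the batch size and the loop variables are assumptions, and it assumes the statement is not wrapped in an outer transaction.

```sql
-- Sketch only: @BatchSize and @Rows are hypothetical names and values.
DECLARE @BatchSize int = 100000;  -- assumed value; tune to your log and lock budget
DECLARE @Rows int = 1;

WHILE @Rows > 0
BEGIN
    -- Same join as above, limited to @BatchSize rows per statement.
    UPDATE TOP (@BatchSize) T1
    SET [timestamp_prev] = T2.[timestamp]
    FROM [dbo].[Location Data] T1
        INNER JOIN [dbo].[Location Data] T2
            ON T1.RowNumber = T2.RowNumber + 1
           AND T1.mmsi = T2.mmsi
    WHERE T1.[timestamp_prev] IS NULL
      AND T2.[timestamp] IS NOT NULL;  -- guard: a NULL source row would otherwise be picked up again on every pass

    SET @Rows = @@ROWCOUNT;  -- the loop ends when a pass updates nothing
END;
```

Rows that have no matching previous record never satisfy the inner join, so they stay NULL and do not keep the loop running.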
2

First, I would do this with lag():

with toupdate as (
     select ld.*, 
      lag(timestamp) over (partition by mmsi order by RowNumber) as prev_timestamp 
     from dbo.[Location Data] ld 
    ) 
update toupdate 
    set timestamp_prev = prev_timestamp;

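Then, I would point out that updating 200 million records is going to take a long, long, long time. I suggest you generate a new table with the columns you want, then truncate the original table and repopulate it.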
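The rebuild suggested above could look roughly like the sketch below. It assumes the table contains only the four columns that appear in the question (`mmsi`, `RowNumber`, `timestamp`, `timestamp_prev`); any other columns would have to be added to each column list. The staging table name is hypothetical.

```sql
-- 1. Build the new data set, computing the previous timestamp with LAG.
SELECT ld.mmsi,
       ld.RowNumber,
       ld.[timestamp],
       LAG(ld.[timestamp]) OVER (PARTITION BY ld.mmsi
                                 ORDER BY ld.RowNumber) AS timestamp_prev
INTO dbo.[Location Data Staging]      -- hypothetical staging table
FROM dbo.[Location Data] ld;

-- 2. Empty the original table (fails if it is referenced by foreign keys).
TRUNCATE TABLE dbo.[Location Data];

-- 3. Reload it from the staging table.
--    Assumption: RowNumber is not an IDENTITY column; if it is,
--    SET IDENTITY_INSERT would be needed around this statement.
INSERT INTO dbo.[Location Data] WITH (TABLOCK)
       (mmsi, RowNumber, [timestamp], timestamp_prev)
SELECT mmsi, RowNumber, [timestamp], timestamp_prev
FROM dbo.[Location Data Staging];

-- 4. Remove the staging table once the reload has been verified.
DROP TABLE dbo.[Location Data Staging];
```

Note that `LAG` partitioned by `mmsi` returns the previous row for the same `mmsi` even when the `RowNumber` values are not consecutive, whereas the `RowNumber + 1` joins in the other answers only match an immediately preceding row number. Which behavior is wanted depends on how `RowNumber` was generated.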

+1

Nice one with **LAG**. Didn't know that even existed! – Sam

+1

@Sam [LAG](https://msdn.microsoft.com/en-IN/library/hh231256.aspx) was introduced in SQL Server 2012 –

+2

@Prdp I guess Sam is `LAG`ging behind :-) –