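SQL Server - nested query takes 40 minutes to run
I have a very large web forum application (about 20 million posts since 2001) running off a SQL Server 2012 database. The data file is about 40 GB in size.
I have added indexes to the tables on the relevant fields, but this query (which shows the date range of posts for each forum) takes about 40 minutes to run: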
SELECT
    T2.ForumId,
    Forums.Title,
    T2.ForumThreads,
    T2.ForumPosts,
    T2.ForumStart,
    T2.ForumStop
FROM
    Forums
    INNER JOIN (
        SELECT
            Min(ThreadStart) As ForumStart,
            Max(ThreadStop) As ForumStop,
            Count(*) As ForumThreads,
            Sum(ThreadPosts) As ForumPosts,
            Threads.ForumId
        FROM
            Threads
            INNER JOIN (
                SELECT
                    Min(Posts.DateTime) As ThreadStart,
                    Max(Posts.DateTime) As ThreadStop,
                    Count(*) As ThreadPosts,
                    Posts.ThreadId
                FROM
                    Posts
                GROUP BY
                    Posts.ThreadId
            ) As P2 ON Threads.ThreadId = P2.ThreadId
        GROUP BY
            Threads.ForumId
    ) AS T2 ON T2.ForumId = Forums.ForumId
How can I speed this up?
UPDATE:
Here is the estimated execution plan, read from right to left:
[Path 1]
Clustered Index Scan (Clustered) [Posts].[PK_Posts], Cost: 98%
Hash Match (Partial Aggregate), Cost: 2%
Parallelism (Repartition Streams), Cost: 0%
Hash Match (Aggregate), Cost: 0%
Compute Scalar, Cost: 0%
Bitmap (Bitmap Create), Cost: 0%
[Path 2]
Index Scan (NonClustered) [Threads].[IX_ForumId], Cost: 0%
Parallelism (Repartition Streams), Cost: 0%
[Path 1 and 2 converge into Path 3]
Hash Match (Inner Join), Cost: 0%
Hash Match (Partial Aggregate), Cost: 0%
Parallelism (Repartition Streams), Cost: 0%
Sort, Cost: 0%
Stream Aggregate (Aggregate), Cost: 0%
Compute Scalar, Cost: 0%
[Path 4]
Clustered Index Seek (Clustered) [Forums].[PK_Forums], Cost: 0%
[Path 3 and 4 converge into Path 5]
Nested Loops (Inner Join), Cost: 0%
Parallelism (Gather Streams), Cost: 0%
SELECT, Cost: 0%
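The plan above puts 98% of the estimated cost on the clustered index scan of Posts. One way to check how much I/O that step really does is to run the innermost aggregate on its own with statistics output turned on. This is only a sketch that reuses the table and column names from the query; the numbers it reports will depend on the actual data and hardware.

```sql
-- Report page reads and CPU/elapsed time for each statement in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- The innermost derived table (P2) from the original query, run by itself.
SELECT
    MIN(Posts.[DateTime]) AS ThreadStart,
    MAX(Posts.[DateTime]) AS ThreadStop,
    COUNT(*)              AS ThreadPosts,
    Posts.ThreadId
FROM
    Posts
GROUP BY
    Posts.ThreadId;

-- Turn the extra output back off.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The "logical reads" figure for Posts in the Messages tab shows how many pages the scan touches, which gives a baseline to compare against after any index change.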
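What does the execution plan for the query look like? – Taryn
40 gig? That is not unusual... add indexes! – mschr
Turn those "scans" into "seeks" and it will perform better, by adding or changing indexes. You may also want to split the tables into partitions. – SQLMason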
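Following up on the index suggestions in the comments, a minimal sketch of one index that might help. The innermost query only needs ThreadId and DateTime from Posts, so a narrow nonclustered index on those two columns could let SQL Server read that index instead of the whole clustered index. The index name and the dbo schema are assumptions, not something given in the question. Since the query aggregates every post, this would most likely still show up as a scan rather than a seek; any gain would come from reading far fewer pages.

```sql
-- Hypothetical index name; the dbo schema is assumed.
-- Keyed on ThreadId first so rows for the same thread are stored together,
-- which may let the GROUP BY use an ordered Stream Aggregate.
-- DateTime as the second key column covers the MIN/MAX/COUNT in the inner query.
CREATE NONCLUSTERED INDEX IX_Posts_ThreadId_DateTime
    ON dbo.Posts (ThreadId, [DateTime]);
```

Building this on 20 million rows will take some time and extra disk space, so it is worth trying on a copy of the database first and comparing the estimated plan before and after.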