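Optimizing SQL JOIN call

I have a poorly performing SQL query. I have done some research on joins, watched tutorials, made sure I have defined the right indexes, and so on, but honestly I am at a bit of a loss as to how to improve the performance of this particular query.
I have the following schema definitions: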
create_table "training_plans", :force => true do |t|
  t.integer "user_id"
end
add_index "training_plans", ["user_id"], :name => "index_training_plans_on_user_id"

create_table "training_weeks", :force => true do |t|
  t.integer "training_plan_id"
  t.date "start_date"
end
add_index "training_weeks", ["training_plan_id", "start_date"], :name => "index_training_weeks_on_training_plan_id_and_start_date"
add_index "training_weeks", ["training_plan_id"], :name => "index_training_weeks_on_training_plan_id"

create_table "training_efforts", :force => true do |t|
  t.string "name"
  t.date "plandate"
  t.integer "training_week_id"
end
add_index "training_efforts", ["plandate"], :name => "index_training_efforts_on_plandate"
add_index "training_efforts", ["training_week_id", "plandate"], :name => "index_training_efforts_on_training_week_id_and_plandate"
add_index "training_efforts", ["training_week_id"], :name => "index_training_efforts_on_training_week_id"
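The following call then collects all training_efforts associated with a particular training_plan, including all associated ride objects, where the training_effort plandates fall within the target date range, and sorts the results.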
tefts = self.training_efforts.includes(:rides).order("plandate ASC").where("plandate >= ? AND plandate <= ?",
beginning_date,
end_date)
This produces the following query output:
TrainingEffort Load (3393.6ms) SELECT "training_efforts".* FROM "training_efforts"
INNER JOIN "training_weeks" ON "training_efforts"."training_week_id" = "training_weeks"."id"
WHERE "training_weeks"."training_plan_id" = 104
AND (plandate >= '2015-01-05' AND plandate <= '2016-01-03') ORDER BY plandate ASC
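I believe I have defined the right indexes. The tables are not large. Yet the query takes a substantial amount of time. As further background, this is on Heroku Postgres. Finally, I will mention that on my development system this query is slower than most (3.3ms), but still nowhere near 1000 times slower than typical...
Thanks in advance for any help optimizing this query.
UPDATE: Here is the EXPLAIN output for the query (issued on my development system):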
explain SELECT "training_efforts".* FROM "training_efforts" INNER JOIN "training_weeks"
ON "training_efforts"."training_week_id" = "training_weeks"."id"
WHERE "training_weeks"."training_plan_id" = 7
AND (plandate >= '2015-01-05' AND plandate <= '2016-01-03') ORDER BY plandate ASC;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Sort (cost=430.52..432.04 rows=606 width=120)
Sort Key: training_efforts.plandate
-> Hash Join (cost=15.12..402.51 rows=606 width=120)
Hash Cond: (training_efforts.training_week_id = training_weeks.id)
-> Seq Scan on training_efforts (cost=0.00..377.25 rows=1089 width=120)
Filter: ((plandate >= '2015-01-05'::date) AND (plandate <= '2016-01-03'::date))
-> Hash (cost=11.86..11.86 rows=261 width=4)
-> Seq Scan on training_weeks (cost=0.00..11.86 rows=261 width=4)
Filter: (training_plan_id = 7)
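As an aside on reading a plan like the one above: the nodes worth checking first are the scans, since they show whether each table is read sequentially or through an index. Below is a minimal Ruby sketch that pulls the scan nodes out of text-format EXPLAIN output. SCAN_NODE, scan_nodes, seq_scanned_tables and FIRST_PLAN are hypothetical names made up for this illustration, and the regular expression only covers the common scan labels, not every node type PostgreSQL can emit.

```ruby
# Hypothetical helpers for illustration; not part of Rails or PostgreSQL.
# Matches the common scan node labels in text-format EXPLAIN output.
SCAN_NODE = /(Seq Scan|Bitmap Heap Scan|Bitmap Index Scan|Index Only Scan|Index Scan)(?: using (\S+))? on (\S+)/

# Returns one hash per scan node found in the plan text.
# For "Bitmap Index Scan on <name>", :target is the index name, not a table.
def scan_nodes(plan_text)
  plan_text.each_line.map do |line|
    match = SCAN_NODE.match(line)
    match && { node: match[1], using: match[2], target: match[3] }
  end.compact
end

# Names of the tables that the plan reads with a sequential scan.
def seq_scanned_tables(plan_text)
  scan_nodes(plan_text).select { |n| n[:node] == "Seq Scan" }.map { |n| n[:target] }
end

# Abbreviated copy of the plan above (Sort Key, Hash Cond and Filter lines omitted).
FIRST_PLAN = <<~PLAN
  Sort (cost=430.52..432.04 rows=606 width=120)
    -> Hash Join (cost=15.12..402.51 rows=606 width=120)
      -> Seq Scan on training_efforts (cost=0.00..377.25 rows=1089 width=120)
      -> Hash (cost=11.86..11.86 rows=261 width=4)
        -> Seq Scan on training_weeks (cost=0.00..11.86 rows=261 width=4)
PLAN

p seq_scanned_tables(FIRST_PLAN) # prints ["training_efforts", "training_weeks"]
```

Both tables show up as sequentially scanned here, which is the observation that prompts the question of why none of the indexes are being picked.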
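UPDATE 2: Trying a different query to see whether my indexes would be used, and noting that there are 7 times as many training_efforts as training_weeks (both tables have a date column), I tried searching on the training_week dates instead of the training_effort dates, as follows: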
explain SELECT "training_efforts".* FROM "training_efforts" INNER JOIN "training_weeks"
ON "training_weeks"."id" = "training_efforts"."training_week_id"
WHERE "training_weeks"."id" IN (SELECT "training_weeks"."id" FROM "training_weeks"
WHERE "training_weeks"."training_plan_id" = 7 AND (start_date >= '2015-01-05' AND start_date <= '2016-01-03'))
ORDER BY plandate ASC;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=376.83..378.34 rows=602 width=120)
Sort Key: training_efforts.plandate
-> Nested Loop (cost=14.23..349.04 rows=602 width=120)
-> Hash Semi Join (cost=13.95..26.83 rows=86 width=8)
Hash Cond: (training_weeks.id = training_weeks_1.id)
-> Seq Scan on training_weeks (cost=0.00..10.69 rows=469 width=4)
-> Hash (cost=12.87..12.87 rows=86 width=4)
-> Bitmap Heap Scan on training_weeks training_weeks_1 (cost=5.37..12.87 rows=86 width=4)
Recheck Cond: ((training_plan_id = 7) AND (start_date >= '2015-01-05'::date) AND (start_date <= '2016-01-03'::date))
-> Bitmap Index Scan on index_training_weeks_on_training_plan_id_and_start_date (cost=0.00..5.35 rows=86 width=0)
Index Cond: ((training_plan_id = 7) AND (start_date >= '2015-01-05'::date) AND (start_date <= '2016-01-03'::date))
-> Index Scan using index_training_efforts_on_training_week_id on training_efforts (cost=0.28..3.68 rows=7 width=120)
Index Cond: (training_week_id = training_weeks.id)
This seems slightly better, but I am still not confident that it is as optimized as it could be...
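I agree with your point about the indexes... why would that be? I will try an alternative query format to see whether it uses the indexes. The three tables have thousands of rows (5-30k). They have been around for months. I just ran an analyze, and it reports that they were autovacuumed within the past two days.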
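Did the execution plan (or the speed) change after analyzing? Running VACUUM ANALYZE is important because the optimizer bases its decisions on statistics about the data. If it thinks your data is very small, or that you will be querying most of the data, it will ignore the indexes entirely, since they can be inefficient in those cases.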
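To make that suggestion concrete, here is a small sketch of the statements involved, written as plain Ruby that only builds SQL strings to paste into psql or run through your database client. vacuum_analyze_sql and explain_analyze_sql are hypothetical helper names invented for this example, and the quoting is naive: it assumes plain table names like the ones in the schema. Note that EXPLAIN with the ANALYZE option actually executes the query, so the timings it reports are real rather than estimated.

```ruby
# Hypothetical helpers for illustration; they only build SQL text and
# never touch a database.

# One VACUUM ANALYZE statement per table, to refresh planner statistics.
# Assumes simple table names that contain no double quotes.
def vacuum_analyze_sql(tables)
  tables.map { |table| %(VACUUM ANALYZE "#{table}";) }
end

# Wraps a query in EXPLAIN (ANALYZE, BUFFERS) so the plan shows actual
# row counts and timings next to the planner's estimates.
def explain_analyze_sql(query)
  "EXPLAIN (ANALYZE, BUFFERS) #{query.strip.chomp(';')};"
end

puts vacuum_analyze_sql(%w[training_plans training_weeks training_efforts])
puts explain_analyze_sql('SELECT "training_efforts".* FROM "training_efforts" WHERE plandate >= \'2015-01-05\';')
```

Comparing the plan before and after the VACUUM ANALYZE is the quickest way to tell whether stale statistics were the reason the indexes were skipped.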
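Thanks joe and @khampson. I am marking this as the answer since it came closest to solving the problem. I need to wait a few days to look at the logs before I am satisfied with the results. Basically, I changed the query to 'tefts = TrainingEffort.includes(:rides).order("plandate ASC").joins(:training_week).where(:training_weeks => {:id => self.training_weeks.where("start_date >= ? AND start_date <= ?", beginning_date, end_date)})'. Then, after vacuuming the DB, I upgraded the database on Heroku from hobby to standard. That combination did the trick.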