Index usage with a single-item ANY(ARRAY[...])
Test table and indexes (PostgreSQL 9.5.3):
CREATE TABLE public.t (id serial, a integer, b integer);
INSERT INTO t(a, b)
SELECT round(random()*1000), round(random()*1000)
FROM generate_series(1, 1000000);
CREATE INDEX "i_1" ON public.t USING btree (a, b);
CREATE INDEX "i_2" ON public.t USING btree (b);
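The plans quoted below include actual timings and buffer counts, which suggests they were captured with something like EXPLAIN (ANALYZE, BUFFERS). The exact invocation is not shown in the post, so treat the following as an assumption rather than the original commands. Running ANALYZE first is also an assumption; it simply makes sure the planner has fresh statistics after the bulk insert.

```sql
-- Assumption: refresh planner statistics after loading 1,000,000 rows.
ANALYZE public.t;

-- Assumption: this is roughly how each plan in this post was captured.
-- ANALYZE executes the query and reports actual times and row counts;
-- BUFFERS adds the "Buffers: shared hit=... read=..." lines.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM t WHERE a = 50 ORDER BY b LIMIT 1;
```

The same wrapper can be put in front of each of the queries that follow.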
With "a = 50" in the first query, everything is fine and the appropriate index "i_1" is used:
SELECT * FROM t WHERE a = 50 ORDER BY b LIMIT 1;
"Limit (cost=0.42..4.03 rows=1 width=12) (actual time=0.085..0.085 rows=1 loops=1)"
" Buffers: shared hit=1 read=3"
" -> Index Scan using i_1 on t (cost=0.42..4683.12 rows=1300 width=12) (actual time=0.084..0.084 rows=1 loops=1)"
" Index Cond: (a = 50)"
" Buffers: shared hit=1 read=3"
"Planning time: 0.637 ms"
"Execution time: 0.114 ms"
With "a IN (50)" the result is the same:
SELECT * FROM t WHERE a IN (50) ORDER BY b LIMIT 1;
"Limit (cost=0.42..4.03 rows=1 width=12) (actual time=0.058..0.058 rows=1 loops=1)"
" Buffers: shared hit=4"
" -> Index Scan using i_1 on t (cost=0.42..4683.12 rows=1300 width=12) (actual time=0.056..0.056 rows=1 loops=1)"
" Index Cond: (a = 50)"
" Buffers: shared hit=4"
"Planning time: 0.287 ms"
"Execution time: 0.105 ms"
The problem appears when I try "a = ANY(ARRAY[50])". The wrong index "i_2" is used instead of "i_1", and execution takes about 25 times longer:
SELECT * FROM t WHERE a = ANY(ARRAY[50]) ORDER BY b LIMIT 1;
"Limit (cost=0.42..38.00 rows=1 width=12) (actual time=2.591..2.591 rows=1 loops=1)"
" Buffers: shared hit=491 read=4"
" -> Index Scan using i_2 on t (cost=0.42..48853.65 rows=1300 width=12) (actual time=2.588..2.588 rows=1 loops=1)"
" Filter: (a = ANY ('{50}'::integer[]))"
" Rows Removed by Filter: 520"
" Buffers: shared hit=491 read=4"
"Planning time: 0.251 ms"
"Execution time: 2.627 ms"
You might say: "PostgreSQL cannot use an index if you use ANY(ARRAY[])", but in fact it can. If I remove the "ORDER BY", index "i_1" is used again:
SELECT * FROM t WHERE a = ANY(ARRAY[50]) LIMIT 1;
"Limit (cost=0.42..4.03 rows=1 width=12) (actual time=0.034..0.034 rows=1 loops=1)"
" Buffers: shared hit=4"
" -> Index Scan using i_1 on t (cost=0.42..4683.12 rows=1300 width=12) (actual time=0.033..0.033 rows=1 loops=1)"
" Index Cond: (a = ANY ('{50}'::integer[]))"
" Buffers: shared hit=4"
"Planning time: 0.182 ms"
"Execution time: 0.090 ms"
My questions:
If PostgreSQL is smart enough to work well with "IN", what is the problem with ANY(ARRAY[])?
Why does it work with ANY(ARRAY[]) once I remove the "ORDER BY" clause?
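To narrow down where the two forms diverge, it may help to run the variants side by side under EXPLAIN. The sketch below only reuses the table and indexes defined above. The "ORDER BY a, b" variant is an extra experiment that is not in the original post; it is added on the assumption that an ordering matching the column order of "i_1" gives the planner a chance to use that index for the sort. No expected output is shown, since plans depend on the server version and on the statistics.

```sql
-- Single-element IN: reported above to use i_1 with Index Cond (a = 50).
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM t WHERE a IN (50) ORDER BY b LIMIT 1;

-- Array form: reported above to scan i_2 and apply the condition as a Filter.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM t WHERE a = ANY(ARRAY[50]) ORDER BY b LIMIT 1;

-- Hypothetical experiment (not from the post): sort by both columns of i_1.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM t WHERE a = ANY(ARRAY[50]) ORDER BY a, b LIMIT 1;
```

Comparing the "Index Cond" and "Filter" lines across the three plans shows whether the condition on a is used to drive the index scan or only applied to rows read in b order.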
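Why use "IN"? If it is better optimized, is it good practice to use "IN" instead of "ANY(ARRAY[])"? "The result will not be sorted by b": can you provide some reference documentation to confirm this?
I have expanded the answer to address these questions.
This is only about an array with a single element. In any case, if there is no real reason why IN works differently (and, in this case, better) than ANY(ARRAY[]), then this is confusing.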