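PostgreSQL join performance
I have a PostgreSQL 9.3 database with two tables:
TableA (ID numeric, VALUE numeric(5,4)) - about 1 million records.
TableB (ID numeric, field2 .. fieldN) - about 5 million records.
I am trying to do a simple JOIN of TableA and TableB on the ID field, filtered by the VALUE field. Although VALUE can hold 4 decimal places, the filter applied never has more than two.
My SQL looks like this (0.45 is the filter parameter, which ranges from 0.00 to 1.00):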
SELECT A.ID, A.VALUE, B.a_lot_of_fields_here
FROM TableA A, TableB B
WHERE A.ID = B.ID
and A.VALUE >= 0.45
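Both IDs (A.ID, B.ID) are primary keys. I would like to know whether there is anything I can do to speed this query up further. Any suggestions?
This is the explain plan for the query: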
"Hash Join (cost=22205.73..310550.04 rows=1167395 width=105)"
" Hash Cond: ((a.id)::text = (b.id)::text)"
" -> Seq Scan on tablea a (cost=0.00..140163.94 rows=3557794 width=80)"
" -> Hash (cost=17404.18..17404.18 rows=248284 width=25)"
" -> Bitmap Heap Scan on tableb b (cost=4652.63..17404.18 rows=248284 width=25)"
" Recheck Cond: (value >= 0.1)"
" -> Bitmap Index Scan on index_test (cost=0.00..4590.56 rows=248284 width=0)"
" Index Cond: (value >= 0.1)"
And this is the one with ANALYZE / BUFFERS:
"Hash Join (cost=22205.73..315312.78 rows=1398289 width=109) (actual time=2065.165..12794.984 rows=1267024 loops=1)"
" Hash Cond: ((a.id)::text = (b.id)::text)"
" Buffers: shared hit=177 read=71080 written=47217, temp read=47454 written=47424"
" -> Seq Scan on tablea a (cost=0.00..107631.74 rows=4261474 width=84) (actual time=0.014..2815.098 rows=3557794 loops=1)"
" Buffers: shared hit=175 read=64842 written=43965"
" -> Hash (cost=17404.18..17404.18 rows=248284 width=25) (actual time=2047.615..2047.615 rows=248617 loops=1)"
" Buckets: 2048 Batches: 16 Memory Usage: 901kB"
" Buffers: shared hit=2 read=6238 written=3252, temp written=1319"
" -> Bitmap Heap Scan on tableb b (cost=4652.63..17404.18 rows=248284 width=25) (actual time=491.395..1914.202 rows=248617 loops=1)"
" Recheck Cond: (value >= 0.1)"
" Buffers: shared hit=2 read=6238 written=3252"
" -> Bitmap Index Scan on index_test (cost=0.00..4590.56 rows=248284 width=0) (actual time=448.286..448.286 rows=248617 loops=1)"
" Index Cond: (value >= 0.1)"
" Buffers: shared read=682"
"Total runtime: 12905.306 ms"
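I am now looking at the A.ID and B.ID fields: they are text fields... I am going to encode these text values as integers to see how much of a performance gain that gives.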
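A minimal sketch of what that conversion could look like, assuming every ID is a digit-only string that fits in a bigint (that is an assumption, not something verified against the data; IDs that are not purely numeric would need a separate mapping table instead):

```sql
-- Assumption: every ID is a digit-only string that fits in bigint.
-- Each ALTER rewrites the whole table and rebuilds its primary-key index
-- while holding an exclusive lock, so try it on a copy of the data first.
BEGIN;
ALTER TABLE tablea ALTER COLUMN id TYPE bigint USING id::bigint;
ALTER TABLE tableb ALTER COLUMN id TYPE bigint USING id::bigint;
COMMIT;

-- Refresh planner statistics for the converted columns.
ANALYZE tablea;
ANALYZE tableb;
```

After the change, the `Hash Cond` line in the plan should compare the IDs directly, without the `::text` casts.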
Postgres configuration:
I looked at postgresql.conf. All parameters are at their default values :( (I have no experience tuning them). Only one has been edited, by the sysadmin:
shared_buffers = 128MB
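Besides shared_buffers, the `Batches: 16` line and the `temp read` / `temp written` counters in the ANALYZE output suggest that the hash table did not fit in memory and spilled to temporary files, which is governed by work_mem. A sketch of how that could be tested for a single session, without editing postgresql.conf. The 64MB figure is an arbitrary value for the experiment, not a recommendation, and `B.*` is a placeholder for the elided column list:

```sql
-- Check the current settings first.
SHOW shared_buffers;
SHOW work_mem;

-- Raise work_mem for this session only (64MB is an arbitrary test value).
SET work_mem = '64MB';

-- Re-run the query with the 0.1 filter from the plans above.
-- Note: EXPLAIN ANALYZE really executes the query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT A.ID, A.VALUE, B.*
FROM TableA A, TableB B
WHERE A.ID = B.ID
  AND A.VALUE >= 0.1;

-- Go back to the configured value.
RESET work_mem;
```

If the new plan reports `Batches: 1` and the temp counters are gone, the hash join ran entirely in memory, and the timings can be compared with the 12905 ms run.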
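What indexes do you have on the tables? – Dan
Since `TableB.ID` is the clustered index anyway, you should not need to change anything on that table. – GarethD
Both tables have an index on ID (they are the PKs). With the 0.1 filter, the join returns about 1,250,000 records in roughly 9 seconds. Does that look like a time that could be improved? – dbermudez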