2015-12-07 94 views
0

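PostgreSQL performance check

I have heard that PostgreSQL is used with tables holding billions of rows and still delivers satisfactory response times, so I ran a simple experiment to check this. I have a table with 6 columns, populated with rows (4255700 according to the EXPLAIN output below). I used pgtune to tune the configuration for my setup. Now, when I run a simple "select * from tab1", it takes 173.425 seconds to fetch all rows. Is this normal behaviour? This is the only table in the database.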

The table is defined as follows:

CREATE TABLE file_group_permissions 
(
    fgp_id serial NOT NULL, 
    file_id integer NOT NULL, 
    pg_id integer NOT NULL, 
    policy_id integer, 
    tag_id integer, 
    inst_id integer, 
    CONSTRAINT file_group_permissions_pkey PRIMARY KEY (fgp_id) 
) 
WITH (
    OIDS=FALSE 
); 
ALTER TABLE file_group_permissions 
    OWNER TO sa; 

-- Index: fgp_file_idx 

-- DROP INDEX fgp_file_idx; 

CREATE INDEX fgp_file_idx 
    ON file_group_permissions 
    USING btree 
    (file_id); 
ALTER TABLE file_group_permissions CLUSTER ON fgp_file_idx; 

-- Index: fgp_inst_idx 

-- DROP INDEX fgp_inst_idx; 

CREATE INDEX fgp_inst_idx 
    ON file_group_permissions 
    USING btree 
    (inst_id); 

-- Index: fgp_tag_idx 

-- DROP INDEX fgp_tag_idx; 

CREATE INDEX fgp_tag_idx 
    ON file_group_permissions 
    USING btree 
    (tag_id); 

-- Index: pgfgp_idx 

-- DROP INDEX pgfgp_idx; 

CREATE INDEX pgfgp_idx 
    ON file_group_permissions 
    USING btree 
    (pg_id); 

Output of EXPLAIN (ANALYZE, BUFFERS) select * from file_group_permissions:

"Seq Scan on file_group_permissions (cost=0.00..69662.00 rows=4255700 width=24) (actual time=0.019..580.273 rows=4255700 loops=1)"
" Buffers: shared hit=2432 read=24673"
"Planning time: 0.070 ms"
"Execution time: 903.325 ms"

I have a MacBook Pro with 16 GB of RAM and a 512 GB SSD. I have configured PostgreSQL to use 2 GB of RAM.
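Settings edited in postgresql.conf only take effect after a reload or, for some parameters, a restart, so it can be worth confirming what the running server actually uses. A minimal check, assuming the four parameters below are the ones of interest (the list is illustrative, not taken from the pgtune output):

```sql
-- Show the effective value, its unit, and where the value came from
-- (default, configuration file, session, ...).
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'temp_buffers', 'effective_cache_size')
ORDER BY name;
```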

EDIT

EXPLAIN (ANALYZE, BUFFERS) select pg_id, count(distinct file_id) from file_group_permissions where pg_id in (6117,6115,6116,6113,6114) group by 1; 


"GroupAggregate (cost=0.44..102028.21 rows=208 width=8) (actual time=4970.884..5013.423 rows=3 loops=1)" 
" Group Key: pg_id" 
" Buffers: shared hit=50891, temp read=4824 written=4824" 
" -> Index Scan using pgfgp_idx on file_group_permissions (cost=0.44..85511.31 rows=3302964 width=8) (actual time=0.062..1080.926 rows=3323389 loops=1)" 
"  Index Cond: (pg_id = ANY('{6117,6115,6116,6113,6114}'::integer[]))" 
"  Buffers: shared hit=50891" 
"Planning time: 0.219 ms" 
"Execution time: 5013.495 ms" 

EDIT1

I moved this table into a separate new database and followed the suggestions (a composite index and changes to the PostgreSQL conf). Here is the new plan:

"GroupAggregate (cost=478307.10..502996.67 rows=209 width=8) (actual time=7500.426..7528.021 rows=3 loops=1)"
" Group Key: pg_id"
" Buffers: shared read=27105, temp read=12137 written=12137"
" -> Sort (cost=478307.10..486536.26 rows=3291664 width=8) (actual time=2944.597..3647.248 rows=3323389 loops=1)"
"  Sort Key: pg_id"
"  Sort Method: external sort Disk: 58488kB"
"  Buffers: shared read=27105, temp read=7311 written=7311"
"  -> Seq Scan on file_group_permissions (cost=0.00..96260.12 rows=3291664 width=8) (actual time=0.016..1516.743 rows=3323389 loops=1)"
"    Filter: (pg_id = ANY ('{6117,6115,6116,6113,6114}'::integer[]))"
"    Rows Removed by Filter: 932311"
"    Buffers: shared read=27105"
"Planning time: 0.514 ms"
"Execution time: 7540.243 ms"

This table is just crazy; it hurts performance everywhere it is joined.
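For reference, one way the composite-index suggestion could be combined with a rewrite of the aggregate is sketched below. The index name fgp_pg_file_idx is hypothetical, and whether the planner actually chooses an index-only scan or a hash aggregate depends on the PostgreSQL version, the statistics and the visibility map, so this is something to try rather than a guaranteed fix. The rewrite is equivalent to count(distinct file_id) only because file_id is declared NOT NULL.

```sql
-- Hypothetical index name; it covers both columns the query touches.
CREATE INDEX fgp_pg_file_idx
    ON file_group_permissions
    USING btree
    (pg_id, file_id);

-- Refresh statistics and the visibility map so an index-only scan is possible.
VACUUM ANALYZE file_group_permissions;

-- De-duplicate (pg_id, file_id) pairs first, then count per pg_id.
-- This gives the planner the option of hashing instead of sorting all rows.
EXPLAIN (ANALYZE, BUFFERS)
SELECT pg_id, count(*) AS distinct_files
FROM (
    SELECT DISTINCT pg_id, file_id
    FROM file_group_permissions
    WHERE pg_id IN (6117, 6115, 6116, 6113, 6114)
) AS d
GROUP BY pg_id;
```

Comparing the Buffers lines of this plan with the earlier ones shows whether the temp read/written counters have dropped.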

+2

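Transferring 4255700 rows from the server to the client will take some time. After all, that is roughly 200MB that has to be sent, received and processed (and possibly displayed) by the client software.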
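The EXPLAIN ANALYZE output in the question reports about 0.9 seconds of execution time on the server, which suggests most of the 173 seconds is spent on transfer and display. A rough way to check the data volume and to fetch in batches instead of all at once is sketched here; the cursor name fgp_cur and the batch size of 10000 are arbitrary choices for illustration.

```sql
-- How much data is there? Heap only, and heap plus indexes.
SELECT pg_size_pretty(pg_relation_size('file_group_permissions'))       AS heap_size,
       pg_size_pretty(pg_total_relation_size('file_group_permissions')) AS with_indexes;

-- Fetch the rows in batches so the client never has to hold or render
-- all of them at once. Cursors without WITH HOLD need a transaction.
BEGIN;
DECLARE fgp_cur NO SCROLL CURSOR FOR
    SELECT * FROM file_group_permissions;
FETCH FORWARD 10000 FROM fgp_cur;   -- repeat until no rows are returned
CLOSE fgp_cur;
COMMIT;
```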

+0

This question seems better suited to [dba.se].

+0

@a_horse_with_no_name I understand, but 173 seconds? There must be an answer.

Answers

1

I created a test table just like yours and populated it with exactly the same number of records:

insert into file_group_permissions (file_id,pg_id,policy_id,tag_id,inst_id) 
select 
    trunc(random()*10000) as file_id, 
    trunc(random()*10000) as pg_id, 
    trunc(random()*10000) as policy_id, 
    trunc(random()*10000) as tag_id, 
    trunc(random()*10000) as inst_id 
from generate_series(1,4255700) g 

When I run the query, it executes quite fast:

EXPLAIN (ANALYZE, BUFFERS) 
select pg_id, count(distinct file_id) 
from file_group_permissions 
where pg_id in (6117,6115,6116,6113,6114) group by 1; 

GroupAggregate (cost=0.43..8.32 rows=5 width=8) (actual time=0.339..1.608 rows=5 loops=1) 
    Buffers: shared hit=2158 
    -> Index Scan using pgfgp_idx on file_group_permissions (cost=0.43..8.24 rows=5 width=8) (actual time=0.018..1.170 rows=2147 loops=1) 
     Index Cond: (pg_id = ANY ('{6117,6115,6116,6113,6114}'::integer[])) 
     Buffers: shared hit=2158 
Total runtime: 1.633 ms 

I noticed this line in your execution plan:

Buffers: shared hit=50891, temp read=4824 written=4824 

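temp read=4824 written=4824 tells us that the database is spilling to temporary files on disk while running the query. These counters appear on the GroupAggregate node and not on the index scan, so the spill most likely comes from the sorting done for count(distinct) rather than from the scan itself. You may need to adjust some other postgresql.conf parameters; these are mine: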

shared_buffers  = 1GB 
temp_buffers   = 32MB 
work_mem    = 32MB 
effective_cache_size = 1GB 
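To put a number on the spill and to test whether more sort memory removes it, something like the following could be run in a single session. This is only a sketch: it assumes the default 8 kB block size (consistent with the question's second plan, where temp written=7311 on the Sort node matches Disk: 58488kB), and 128MB is an arbitrary illustrative value, not a recommendation.

```sql
-- Size of the temp spill in the first plan: 4824 blocks of 8 kB each.
SELECT pg_size_pretty(4824 * 8192::bigint) AS temp_spill;

-- Raise work_mem for this session only (illustrative value).
SHOW work_mem;
SET work_mem = '128MB';

-- Re-run the query and check whether "temp read/written" has
-- disappeared from the Buffers lines.
EXPLAIN (ANALYZE, BUFFERS)
select pg_id, count(distinct file_id)
from file_group_permissions
where pg_id in (6117,6115,6116,6113,6114) group by 1;

-- Go back to the configured value.
RESET work_mem;
```

Setting it per session keeps the server-wide value modest, which matters because work_mem applies to each sort or hash operation rather than to the server as a whole.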
+0

Similar here: http://explain.depesz.com/s/H7R2

+0

I have posted my new results, taking all the suggestions into account.

+0

And guys, no data in your table matches, so of course the query is very fast.
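The question's plans show 3323389 of 4255700 rows (about 78%) matching the five pg_id values, whereas the uniformly random test data in the answer matches only 2147 rows. Test data closer to that distribution could be generated along the following lines. The 0.78 share is taken from the plans; spreading the matches evenly over the five ids and drawing the remaining pg_id values from 0 to 5999 are assumptions made purely for illustration, and the statement assumes an empty scratch copy of the table.

```sql
INSERT INTO file_group_permissions (file_id, pg_id, policy_id, tag_id, inst_id)
SELECT
    trunc(random() * 10000)::integer AS file_id,
    CASE
        -- about 78% of rows get one of the ids 6113..6117
        WHEN random() < 0.78 THEN 6113 + trunc(random() * 5)::integer
        -- the rest get an id that never matches the filter
        ELSE trunc(random() * 6000)::integer
    END AS pg_id,
    trunc(random() * 10000)::integer AS policy_id,
    trunc(random() * 10000)::integer AS tag_id,
    trunc(random() * 10000)::integer AS inst_id
FROM generate_series(1, 4255700) AS g;

-- Refresh planner statistics before comparing plans.
ANALYZE file_group_permissions;
```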