Postgres query optimization limits (already using index-only scans)

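I have a Postgres query that has already been optimized, but we are hitting 100% CPU usage under peak load, so I wanted to see whether anything more can be done to optimize the database interaction. It already uses two index-only scans in the join, so my question is whether there is much more to be done on the Postgres side.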

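The database is an Amazon-hosted Postgres RDS db.m3.2xlarge instance (8 vCPUs and 30 GB of memory) running 9.4.1, and the results below come from a period of low CPU usage and minimal connections (around 15). Peak usage is around 300 simultaneous connections, and that is when we max out our CPU (which degrades performance for everything).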

Here are the query and the EXPLAIN output:

Query:

EXPLAIN (ANALYZE, BUFFERS)
SELECT m.valdate, p.index_name, m.market_data_closing, m.available_date
FROM md.market_data_closing m
JOIN md.primitive p ON (m.primitive_id = p.index_id)
WHERE p.index_name = ?
ORDER BY valdate DESC;

Output:

Sort  (cost=183.80..186.22 rows=967 width=44) (actual time=44.590..54.788 rows=11133 loops=1)
  Sort Key: m.valdate
  Sort Method: quicksort  Memory: 1254kB
  Buffers: shared hit=181
  ->  Nested Loop  (cost=0.85..135.85 rows=967 width=44) (actual time=0.041..32.853 rows=11133 loops=1)
        Buffers: shared hit=181
        ->  Index Only Scan using primitive_index_name_index_id_idx on primitive p  (cost=0.29..4.30 rows=1 width=25) (actual time=0.018..0.019 rows=1 loops=1)
              Index Cond: (index_name = '?'::text)
              Heap Fetches: 0
              Buffers: shared hit=3
        ->  Index Only Scan using market_data_closing_primitive_id_valdate_available_date_mar_idx on market_data_closing m  (cost=0.56..109.22 rows=2233 width=27) (actual time=0.016..12.059 rows=11133 loops=1)
              Index Cond: (primitive_id = p.index_id)
              Heap Fetches: 42
              Buffers: shared hit=178
Planning time: 0.261 ms
Execution time: 64.957 ms

Here are the table sizes:

md.primitive: 14283 rows
md.market_data_closing: 13544087 rows

For reference, here are the basic specs of the tables and indexes:

CREATE TABLE md.primitive(
    index_id serial NOT NULL, 
    index_name text NOT NULL UNIQUE, 
    index_description text not NULL, 
    index_source_code text NOT NULL DEFAULT 'MAN', 
    index_source_spec json NOT NULL DEFAULT '{}', 
    frequency text NULL, 
    primitive_type text NULL, 
    is_maintained boolean NOT NULL default true, 
    create_dt timestamp NOT NULL, 
    create_user text NOT NULL, 
    update_dt timestamp not NULL, 
    update_user text not NULL, 
PRIMARY KEY 
(
    index_id 
) 
) ; 

CREATE INDEX ON md.primitive 
(
    index_name ASC, 
    index_id ASC 
); 

CREATE TABLE md.market_data_closing(
    valdate timestamp NOT NULL, 
    primitive_id int references md.primitive, 
    market_data_closing decimal(28, 10) not NULL, 
    available_date timestamp NULL, 
    pricing_source text not NULL, 
    create_dt timestamp NOT NULL, 
    create_user text NOT NULL, 
    update_dt timestamp not NULL, 
    update_user text not NULL, 
PRIMARY KEY 
(
    valdate, 
    primitive_id 
) 
) ; 

CREATE INDEX ON md.market_data_closing 
(
    primitive_id ASC, 
    valdate DESC, 
    available_date DESC, 
    market_data_closing ASC 
);

What else can be done?

Source

2015-11-04 pyfi

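64 ms seems pretty fast to me. How fast do you need it to be?

300 parallel connections on 8 CPUs is too many; for that many CPUs, 100 connections is optimal. You need more CPUs.

I agree with Pavel: sometimes you can gain performance by doing less work at the same time. We had some web applications where we drastically reduced the connection pool size (from > 250 to 50), and throughput increased substantially.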
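Before resizing a pool as the comments above suggest, it can help to see how many of the existing connections are actually doing work. The query below is only a sketch: it assumes PostgreSQL 9.2 or later, where pg_stat_activity has a state column (the 9.4.1 instance in the question qualifies), and the alias sessions is just an illustrative name.

```sql
-- Sketch: count current sessions by state (active, idle, idle in transaction, ...).
-- Without sufficient privileges, sessions owned by other roles may show a NULL state.
SELECT state, count(*) AS sessions
FROM pg_stat_activity
GROUP BY state
ORDER BY sessions DESC;
```

If most sessions turn out to be idle at peak, a smaller pool is unlikely to starve the application; if most are active, the 8 vCPUs are probably oversubscribed, which is what the comments describe.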

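It looks like the nested loop takes a ridiculous amount of time, given that the primitive table returns only one row. You could try something like this to eliminate the nested loop: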

SELECT m.valdate, m.market_data_closing, m.available_date 
FROM md.market_data_closing m 
WHERE m.primitive_id = (SELECT p.index_id 
         FROM md.primitive p 
         WHERE p.index_name = ? 
         OFFSET 0 -- probably not needed, try it
        ) 
ORDER BY valdate DESC;

This does not return p.index_name, but that is easily fixed by selecting it as a constant.
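One way to read "selecting it as a constant" is sketched below. This is an interpretation, not something spelled out in the answer: the same parameter is bound twice, once as an output column and once in the subquery, and the ?::text cast is an assumption added so the parameter has a definite type. OFFSET 0 is left out, since the answer itself marks it as probably unnecessary.

```sql
-- Sketch: the rewritten query with the name returned as a constant column.
-- Bind the same value to both placeholders.
SELECT m.valdate,
       ?::text AS index_name,        -- the name being searched for, echoed back
       m.market_data_closing,
       m.available_date
FROM md.market_data_closing m
WHERE m.primitive_id = (SELECT p.index_id
                        FROM md.primitive p
                        WHERE p.index_name = ?)   -- index_name is UNIQUE, so at most one row
ORDER BY m.valdate DESC;
```

Running it under EXPLAIN (ANALYZE, BUFFERS), as in the question, would show whether the Nested Loop node actually disappears and whether the total time improves.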

Source

2015-11-04 20:48:03
