2016-11-10

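Hardcoding a function parameter yields a 5x speedup

I have the following stored procedure that generates a dynamic query.

Given a list of conditions/filters, it finds all Visitors belonging to a given App. The app_id is passed in as a parameter.

If I call the function with the app id and use that parameter in the dynamic query, it runs in around 200 ms.

However, if I hardcode the app_id, it runs in < 20 ms.

Here is how I call the procedure: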

SELECT id 
FROM find_matching_visitors('my_app_id', '{}', '{(field = ''app_name'' and string_value ILIKE ''My awesome app'' )}') 

Any ideas why this happens?

CREATE OR REPLACE FUNCTION find_matching_visitors(app_id text, default_filters text[], custom_filters text[])
    RETURNS TABLE (
        id varchar
    ) AS
$body$
DECLARE
    default_filterstring text;
    custom_filterstring text;
    default_filter_length integer;
    custom_filter_length integer;
    sql VARCHAR;
BEGIN
    default_filter_length := COALESCE(array_length(default_filters, 1), 0);
    custom_filter_length := COALESCE(array_length(custom_filters, 1), 0);

    -- Default filters are combined with AND, custom filters with OR.
    default_filterstring := array_to_string(default_filters, ' AND ');
    custom_filterstring := array_to_string(custom_filters, ' OR ');

    -- Fall back to an always-true condition when no filters are given.
    IF custom_filterstring = '' OR custom_filterstring IS NULL THEN
        custom_filterstring := '1=1';
    END IF;

    IF default_filterstring = '' OR default_filterstring IS NULL THEN
        default_filterstring := '1=1';
    END IF;

    -- This is the hardcoded variant. Only the two %s placeholders are filled;
    -- format() ignores the two extra arguments.
    sql := format('
        SELECT v.id FROM visitors v
        LEFT JOIN trackings t ON v.id = t.visitor_id
        WHERE v.app_id = ''HARDCODED_APP_ID'' AND (%s) AND (%s)
        GROUP BY v.id
    ', custom_filterstring, default_filterstring, custom_filter_length, custom_filter_length);

    RETURN QUERY EXECUTE sql;
END;
$body$
LANGUAGE plpgsql;

EXPLAIN ANALYZE with hardcoded app_id

Limit (cost=481.86..481.99 rows=50 width=531) (actual time=25.890..25.893 rows=9 loops=1)
  -> Sort (cost=481.86..484.26 rows=960 width=531) (actual time=25.888..25.890 rows=9 loops=1)
       Sort Key: v0.last_seen DESC
       Sort Method: quicksort Memory: 30kB
       -> WindowAgg (cost=414.62..449.97 rows=960 width=531) (actual time=25.862..25.870 rows=9 loops=1)
            -> Hash Join (cost=414.62..437.97 rows=960 width=523) (actual time=25.830..25.841 rows=9 loops=1)
                 Hash Cond: ((find_matching_visitors.id)::text = (v0.id)::text)
                 -> Function Scan on find_matching_visitors (cost=0.25..10.25 rows=1000 width=32) (actual time=15.875..15.876 rows=9 loops=1)
                 -> Hash (cost=354.19..354.19 rows=4814 width=523) (actual time=9.936..9.936 rows=4887 loops=1)
                      Buckets: 8192 Batches: 1 Memory Usage: 2145kB
                      -> Seq Scan on visitors v0 (cost=0.00..354.19 rows=4814 width=523) (actual time=0.013..5.232 rows=4887 loops=1)
                           Filter: ((NOT merged) AND (((type)::text = 'user'::text) OR ((type)::text = 'lead'::text)))
                           Rows Removed by Filter: 138
Planning time: 0.772 ms
Execution time: 26.006 ms

EXPLAIN ANALYZE when app_id is not hardcoded

Limit (cost=481.86..481.99 rows=50 width=531) (actual time=163.579..163.581 rows=9 loops=1)
  -> Sort (cost=481.86..484.26 rows=960 width=531) (actual time=163.578..163.579 rows=9 loops=1)
       Sort Key: v0.last_seen DESC
       Sort Method: quicksort Memory: 30kB
       -> WindowAgg (cost=414.62..449.97 rows=960 width=531) (actual time=163.553..163.560 rows=9 loops=1)
            -> Hash Join (cost=414.62..437.97 rows=960 width=523) (actual time=163.525..163.537 rows=9 loops=1)
                 Hash Cond: ((find_matching_visitors.id)::text = (v0.id)::text)
                 -> Function Scan on find_matching_visitors (cost=0.25..10.25 rows=1000 width=32) (actual time=153.918..153.918 rows=9 loops=1)
                 -> Hash (cost=354.19..354.19 rows=4814 width=523) (actual time=9.578..9.578 rows=4887 loops=1)
                      Buckets: 8192 Batches: 1 Memory Usage: 2145kB
                      -> Seq Scan on visitors v0 (cost=0.00..354.19 rows=4814 width=523) (actual time=0.032..4.993 rows=4887 loops=1)
                           Filter: ((NOT merged) AND (((type)::text = 'user'::text) OR ((type)::text = 'lead'::text)))
                           Rows Removed by Filter: 138
Planning time: 1.134 ms
Execution time: 163.705 ms

Update 1: Added EXPLAIN ANALYZE output for both cases. Note: the two plans are exactly the same; only the time spent differs.

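Update 2: It turns out I needed to pass app_id as an argument to the format function instead of embedding it directly. This brought the query time down to around 20-30 ms.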
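The final version of the function was not posted, so the following is only a sketch of what Update 2 describes. It assumes the value is inlined with the `%L` (quoted literal) specifier of `format`, and it reuses the table and parameter names from the question. The collapsed `COALESCE(NULLIF(...))` initialisers are a stylistic shortcut, not part of the original.

```sql
-- Sketch only: reconstructs the fix described in Update 2 under stated assumptions.
CREATE OR REPLACE FUNCTION find_matching_visitors(app_id text, default_filters text[], custom_filters text[])
    RETURNS TABLE (id varchar) AS
$body$
DECLARE
    -- array_to_string yields '' for an empty array and NULL for a NULL array;
    -- both cases fall back to an always-true condition.
    default_filterstring text := COALESCE(NULLIF(array_to_string(default_filters, ' AND '), ''), '1=1');
    custom_filterstring  text := COALESCE(NULLIF(array_to_string(custom_filters, ' OR '), ''), '1=1');
BEGIN
    -- %L inserts app_id as a properly quoted SQL literal; %s inserts the
    -- filter fragments verbatim, so they must come from a trusted source.
    RETURN QUERY EXECUTE format('
        SELECT v.id FROM visitors v
        LEFT JOIN trackings t ON v.id = t.visitor_id
        WHERE v.app_id = %L AND (%s) AND (%s)
        GROUP BY v.id',
        app_id, custom_filterstring, default_filterstring);
END;
$body$
LANGUAGE plpgsql;
```

The call from the question stays unchanged. `EXECUTE ... USING app_id` with a `$1` placeholder is another way to supply the value; it passes a bound parameter rather than an inlined literal.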

Which PostgreSQL version? –

Using version 9.5 – Tarlen

What does EXPLAIN ANALYSE have to say? –

Answers

Hardcoded values matter when the planner determines the optimal query plan. For example:

select * from some_table where id_person=231 
select * from some_table where id_person=10 

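When 90% of some_table has id_person = 231, pg uses a full table scan because that is the fastest option. When 1% of the records have id_person = 10, it uses an index scan. So the plan used depends on the value of the parameter.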

When you use a non-hardcoded value, for example

select * from some_table where id_person=? 

it cannot determine the optimal query plan, and the query may be slower.
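One way to observe this is to compare the plans for literal values against the plans for a prepared statement. The script below is a sketch with made-up data: the table layout, row counts, and index name are assumptions chosen to mimic the skew described above. No particular plan is guaranteed, and PostgreSQL may still plan a prepared statement using the actual parameter values on its first few executions before it considers a generic plan.

```sql
-- Hypothetical skewed table mirroring the some_table example.
CREATE TABLE some_table (id serial PRIMARY KEY, id_person int NOT NULL);

-- Roughly 90% of rows get id_person = 231; the rest are spread over 0..999.
INSERT INTO some_table (id_person)
SELECT CASE WHEN g % 10 < 9 THEN 231 ELSE (g / 10) % 1000 END
FROM generate_series(1, 100000) AS g;

CREATE INDEX some_table_id_person_idx ON some_table (id_person);
ANALYZE some_table;

-- Literal values: the planner knows the value and can choose a plan per value.
EXPLAIN SELECT * FROM some_table WHERE id_person = 231;
EXPLAIN SELECT * FROM some_table WHERE id_person = 10;

-- Parameterised statement: the value is supplied only at execution time.
PREPARE by_person(int) AS SELECT * FROM some_table WHERE id_person = $1;
EXPLAIN EXECUTE by_person(231);
EXPLAIN EXECUTE by_person(10);
DEALLOCATE by_person;
```

Comparing the four EXPLAIN outputs shows whether the chosen scan type changes with the value on a given installation.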

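As you can see from the EXPLAIN ANALYZE output, they use exactly the same query plan – Tarlen

@Tarlen: this is about the execution plan of the statement _inside_ the function, not of the statement that uses the function –

OK, I found the problem and updated the post – Tarlen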
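The outer EXPLAIN ANALYZE shows the function only as an opaque Function Scan node, which is why both plans in the question look identical. One way to see the plan of the statement inside the function is the auto_explain module. The settings below are a sketch, assuming a session that is allowed to load the module (typically a superuser); nobody in this thread used them.

```sql
-- Load auto_explain for this session only.
LOAD 'auto_explain';

-- Log the plan of every statement, including those run inside functions.
SET auto_explain.log_min_duration = 0;
SET auto_explain.log_analyze = on;
SET auto_explain.log_nested_statements = on;

-- Also send LOG-level messages to the client, not just to the server log.
SET client_min_messages = log;

-- The call from the question; the plan of the dynamic query is logged.
SELECT id
FROM find_matching_visitors('my_app_id', '{}', '{(field = ''app_name'' and string_value ILIKE ''My awesome app'' )}');
```

With log_nested_statements enabled, the logged plan covers the query run by RETURN QUERY EXECUTE, so the slow and fast variants can be compared directly.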
