2017-06-22

Postgresql LARGE query optimization

I've run into a problem in PostgreSQL. This query takes a long time to execute (about 30 seconds without caching). My query is here:

SELECT
    d.name,
    COUNT(*) AS cnt,
    'first' AS TYPE
FROM
    tableA a
INNER JOIN tableD d ON d.NAME = 'FOO'
    AND a.key = d.key
WHERE
    a.DATE > '2017-06-01'
    AND a.DATE < '2017-07-01'
GROUP BY d.name
UNION ALL
SELECT
    d.name,
    COUNT(*) AS cnt,
    'second' AS TYPE
FROM
    tableB b
INNER JOIN tableD d ON d.NAME = 'FOO'
    AND b.key = d.key
WHERE
    b.DATE > '2017-06-01'
    AND b.DATE < '2017-07-01'
GROUP BY d.name
UNION ALL
SELECT
    d.name,
    COUNT(*) AS cnt,
    'Third' AS TYPE
FROM
    tableC c
INNER JOIN tableD d ON d.NAME = 'FOO'
    AND c.key = d.key
WHERE
    c.date > '2017-06-01'
    AND c.date < '2017-07-01'
GROUP BY d.name

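I created indexes on tableC.key (B-tree) and tableC.name (hash), and the other tables have B-tree indexes on date and key.

So my query can join via indexes, and it can filter via indexes.

My table D has a few thousand rows; the others have several billion, or nearly ten billion.

In the execution plan I see that the executor uses nested loops for all of my joins (except one, the B-D join, which uses a hash join).

Maybe I found the culprit: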

"Node Type": "Bitmap Heap Scan",
"Parent Relationship": "Inner",
"Relation Name": "tableA",
"Alias": "a",
"Startup Cost": 2469.84,
"Total Cost": 137625.61,
"Plan Rows": 53748,
"Plan Width": 37,
"Recheck Cond": "(((key)::text = (d.key)::text) AND (date > '2017-06-01 00:00:00'::timestamp without time zone) AND (date < '2017-07-01 00:00:00'::timestamp without time zone))",
"Plans": [{
    "Node Type": "Bitmap Index Scan",
    "Parent Relationship": "Outer",
    "Index Name": "\"date + key\"",
    "Startup Cost": 0.00,
    "Total Cost": 2456.40,
    "Plan Rows": 53748,
    "Plan Width": 0,
    "Index Cond": "(((key)::text = (d.key)::text) AND (date > '2017-06-01 00:00:00'::timestamp without time zone) AND (date < '2017-07-01 00:00:00'::timestamp without time zone))"
}]

Table D:

CREATE TABLE "sch"."tableD" (
    "id" int4 NOT NULL, 
    "key" varchar(36) COLLATE "default", 
    "name" varchar(255) COLLATE "default", 


    CREATE INDEX "license_key" ON "sch"."tableD" USING btree ("key"); 
    CREATE INDEX "name" ON "sch"."tableD" USING btree ("name"); 

Table A:

CREATE TABLE "sch"."tableA" (
    "id" int4 DEFAULT nextval('"sch".table'::regclass) NOT NULL, 
    "key" varchar(255) COLLATE "default", 
    "date" timestamp(6), 

    CREATE INDEX "date" ON "sch"."tableA" USING btree ("date"); 
    CREATE INDEX "date + key" ON "sch"."tableA" USING btree ("key", "date");
    CREATE INDEX "keyIndex" ON "sch"."tableA" USING btree ("key"); 

Tables B and C are similar to A.

I don't know why I'm losing time here. Can you help me solve this? The query shouldn't take 30 seconds. Thanks.


Start by measuring how long each subquery takes. Then you can narrow down the performance problem.
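One way to take that measurement, as a rough sketch: run each branch of the UNION ALL on its own under EXPLAIN (ANALYZE, BUFFERS), which executes the statement and reports actual per-node times and buffer usage. Table and column names are the ones from the question; only the first branch is shown.

```sql
-- Times the first branch only; repeat with tableB and tableC.
-- Note: ANALYZE really executes the query, so this takes as long as the query itself.
EXPLAIN (ANALYZE, BUFFERS)
SELECT d.name, COUNT(*) AS cnt, 'first' AS type
FROM tableA a
INNER JOIN tableD d ON d.name = 'FOO' AND a.key = d.key
WHERE a.date > '2017-06-01'
  AND a.date < '2017-07-01'
GROUP BY d.name;
```

Comparing the "actual time" of the three branches should show which table dominates the 30 seconds.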


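Not sure, but it seems to me we could eliminate the unions and use window functions to get the counts in a single query, with a CASE expression to set the type, and outer joins. – xQbert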


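The first subquery takes the longest, but most of the rows are in table A, so I can imagine that is what slows the query down. If I eliminate my unions, the executor can choose a hash join (or a merge join, if I use a hash index on the keys), but that is even slower (100-120 seconds).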

Answers


Provide these B-tree indexes (rather than hash):

b: (DATE, key) 
b: (key, DATE) 
d: (NAME, key) 
d: (key, NAME) 
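As a sketch, the list above could be written as DDL like this. The index names are made up for illustration, and reading "b" as tableB is an assumption; the schema-qualified, quoted table names follow the CREATE TABLE statements in the question. Presumably the same pair would apply to tables A and C, and table A already has a (key, date) index named "date + key".

```sql
-- Hypothetical index names; table and column names as in the question's DDL.
-- Multi-column indexes have to be B-tree here: PostgreSQL hash indexes are single-column.
CREATE INDEX "tableB_date_key" ON "sch"."tableB" USING btree ("date", "key");
CREATE INDEX "tableB_key_date" ON "sch"."tableB" USING btree ("key", "date");
CREATE INDEX "tableD_name_key" ON "sch"."tableD" USING btree ("name", "key");
CREATE INDEX "tableD_key_name" ON "sch"."tableD" USING btree ("key", "name");
```

On tables this large, CREATE INDEX CONCURRENTLY may be worth considering so that writes are not blocked while the index builds.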

It looks like you want a one-month span, but you are excluding the very start of the month. Change > to >=.
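For illustration, here is the first branch of the question's query with that change applied, giving a half-open range. Everything except the operator is taken from the question.

```sql
-- Half-open range: includes 2017-06-01 00:00:00, excludes 2017-07-01 00:00:00.
SELECT d.name, COUNT(*) AS cnt, 'first' AS type
FROM tableA a
INNER JOIN tableD d ON d.name = 'FOO' AND a.key = d.key
WHERE a.date >= '2017-06-01'
  AND a.date <  '2017-07-01'
GROUP BY d.name;
```

The same change would apply to the tableB and tableC branches.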