PostgreSQL CREATE TABLE AS with multiple WHERE equality conditions

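I am trying to create a new table that is the aggregated sum of 6 other tables with matching primary keys. The query stalls if I use more than 3 input tables: the script is fast (< 5 seconds) when run on 2-3 tables, but hangs otherwise: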

CREATE TABLE table_name AS
SELECT table1.timestamp
     , table1.value + table2.value + table3.value + table4.value AS value
FROM table1, table2, table3, table4
WHERE table1.timestamp=table2.timestamp
AND table2.timestamp=table3.timestamp
AND table3.timestamp=table4.timestamp;

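Admittedly, I have not let it run for longer than 5 minutes, but that is already too slow for my purposes.

Table description: every table has the same format of 6 columns (2 of which are relevant). The primary key is an integer "timestamp", and "value" is a real. The table sizes vary, but each one hovers around 100k rows. The tables mostly share the same primary keys, but each table is missing some data points, so it is essential that those data points are omitted from the new table.

Is there anything I am doing wrong, and what can I do to make it run fast?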

Edit:

PS: here is the actual output of a complete EXPLAIN ANALYZE of the query:

eldb=# EXPLAIN ANALYZE CREATE TABLE test_table AS
SELECT count1.timestamp, count1.year, count1.month, count1.day, count1.period
     , count1.the_value + count2.the_value + count3.the_value + count4.the_value
     + count5.the_value + count6.the_value AS the_value
FROM "table_name-1" AS count1
     , "table_name-2" AS count2
     , "table_name-3" AS count3
     , "table_name-4" AS count4
     , "table_name-5" AS count5
     , "table_name-6" AS count6
WHERE count1.timestamp=count2.timestamp
AND count2.timestamp=count3.timestamp
AND count3.timestamp=count4.timestamp
AND count4.timestamp=count5.timestamp
AND count5.timestamp=count6.timestamp
AND count1.timestamp>2012020000
AND count2.timestamp>2012020000
AND count3.timestamp>2012020000
AND count4.timestamp>2012020000
and count5.timestamp>2012020000
AND count6.timestamp>2012020000;
                                QUERY PLAN
--------------------------------------------------------------------------------
 Merge Join  (cost=20323.61..153806457715456.50 rows=5592655588099248 width=44) (actual time=84.524..3310.692 rows=3410 loops=1)
   Merge Cond: (count1."timestamp" = count4."timestamp")
   ->  Nested Loop  (cost=10161.80..4417379579.26 rows=1057606343 width=40) (actual time=44.597..1616.585 rows=3410 loops=1)
         Join Filter: (count2."timestamp" = count1."timestamp")
         ->  Merge Join  (cost=10161.80..101480.96 rows=6070522 width=16) (actual time=43.648..48.950 rows=3410 loops=1)
               Merge Cond: (count2."timestamp" = count3."timestamp")
               ->  Sort  (cost=5080.90..5168.01 rows=34844 width=8) (actual time=25.608..25.804 rows=3410 loops=1)
                     Sort Key: count2."timestamp"
                     Sort Method: quicksort  Memory: 256kB
                     ->  Seq Scan on "table_name-2" count2  (cost=0.00..1972.66 rows=34844 width=8) (actual time=0.064..23.297 rows=3410 loops=1)
                           Filter: ("timestamp" > 2012020000)
               ->  Materialize  (cost=5080.90..5255.12 rows=34844 width=8) (actual time=18.030..19.847 rows=3410 loops=1)
                     ->  Sort  (cost=5080.90..5168.01 rows=34844 width=8) (actual time=18.023..18.416 rows=3410 loops=1)
                           Sort Key: count3."timestamp"
                           Sort Method: quicksort  Memory: 256kB
                           ->  Seq Scan on "table_name-3" count3  (cost=0.00..1972.66 rows=34844 width=8) (actual time=0.023..16.294 rows=3410 loops=1)
                                 Filter: ("timestamp" > 2012020000)
         ->  Materialize  (cost=0.00..2351.88 rows=34844 width=24) (actual time=0.000..0.147 rows=3410 loops=3410)
               ->  Seq Scan on "table_name-1" count1  (cost=0.00..1972.66 rows=34844 width=24) (actual time=0.020..16.853 rows=3410 loops=1)
                     Filter: ("timestamp" > 2012020000)
   ->  Materialize  (cost=10161.80..4007228099.11 rows=1057606343 width=24) (actual time=39.917..1687.402 rows=3410 loops=1)
         ->  Nested Loop  (cost=10161.80..4004584083.26 rows=1057606343 width=24) (actual time=39.915..1685.956 rows=3410 loops=1)
               Join Filter: (count4."timestamp" = count6."timestamp")
               ->  Merge Join  (cost=10161.80..101480.96 rows=6070522 width=16) (actual time=38.689..44.309 rows=3410 loops=1)
                     Merge Cond: (count4."timestamp" = count5."timestamp")
                     ->  Sort  (cost=5080.90..5168.01 rows=34844 width=8) (actual time=18.960..19.156 rows=3410 loops=1)
                           Sort Key: count4."timestamp"
                           Sort Method: quicksort  Memory: 256kB
                           ->  Seq Scan on "table_name-4" count4  (cost=0.00..1972.66 rows=34844 width=8) (actual time=0.059..17.271 rows=3410 loops=1)
                                 Filter: ("timestamp" > 2012020000)
                     ->  Materialize  (cost=5080.90..5255.12 rows=34844 width=8) (actual time=19.717..21.826 rows=3410 loops=1)
                           ->  Sort  (cost=5080.90..5168.01 rows=34844 width=8) (actual time=19.708..20.266 rows=3410 loops=1)
                                 Sort Key: count5."timestamp"
                                 Sort Method: quicksort  Memory: 256kB
                                 ->  Seq Scan on "table_name-5" count5  (cost=0.00..1972.66 rows=34844 width=8) (actual time=0.034..18.001 rows=3410 loops=1)
                                       Filter: ("timestamp" > 2012020000)
               ->  Materialize  (cost=0.00..2283.88 rows=34844 width=8) (actual time=0.000..0.148 rows=3410 loops=3410)
                     ->  Seq Scan on "table_name-6" count6  (cost=0.00..1972.66 rows=34844 width=8) (actual time=0.036..17.785 rows=3410 loops=1)
                           Filter: ("timestamp" > 2012020000)
 Total runtime: 3330.933 ms
(40 rows)

Here is the table structure (the same for all tables):

CREATE TABLE "table_name-6"
     ( "timestamp" integer NOT NULL
     , year integer NOT NULL
     , month integer NOT NULL
     , day integer NOT NULL
     , period integer NOT NULL
     , the_value real
     , CONSTRAINT "table_name-6_pkey" PRIMARY KEY ("timestamp")
     )

Note: the actual table names and values have been renamed. Also, this output is for only a small portion of the actual table size.

Source

2012-05-27 TimY

What do you want to happen if a particular key exists in only one of the four tables? – wildplasser

I do not want that key to be included in the new table (i.e. it should be skipped entirely). (PS: thanks for the quick response!) – TimY
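The behaviour asked for here (skip any key that is missing from at least one table) is what an inner join on the key already does. A minimal sketch, assuming three small temporary tables whose names (demo_a, demo_b, demo_c) are hypothetical and not from the thread:

```sql
-- Hypothetical demo tables; the names are illustrative only.
CREATE TEMP TABLE demo_a (ztimestamp integer PRIMARY KEY, the_value real);
CREATE TEMP TABLE demo_b (ztimestamp integer PRIMARY KEY, the_value real);
CREATE TEMP TABLE demo_c (ztimestamp integer PRIMARY KEY, the_value real);

INSERT INTO demo_a VALUES (1, 1.0), (2, 1.0), (3, 1.0);
INSERT INTO demo_b VALUES (1, 2.0), (3, 2.0);            -- key 2 is missing here
INSERT INTO demo_c VALUES (1, 3.0), (2, 3.0), (3, 3.0);

-- Inner joins keep only the keys present in all three tables,
-- so the result has rows for keys 1 and 3 and no row for key 2.
SELECT a.ztimestamp
     , a.the_value + b.the_value + c.the_value AS the_value
FROM demo_a AS a
JOIN demo_b AS b ON a.ztimestamp = b.ztimestamp
JOIN demo_c AS c ON b.ztimestamp = c.ztimestamp
ORDER BY a.ztimestamp;
```

The comma-separated FROM list with equality conditions in the WHERE clause is also an inner join, so it drops the same rows.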

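Is timestamp the primary key of each tableX? Do you have indexes? BTW, "timestamp" is a reserved word (a type) in PG. It is better to avoid such words as identifiers. By the way: please add a query plan. You can get one by prefixing the query with EXPLAIN ANALYZE. – wildplasser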
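Both suggestions in this comment can be tried directly. The sketch below assumes the table name "table_name-1" from the question; the new column name ztimestamp is only an example, and pg_indexes is the standard PostgreSQL system view that lists indexes:

```sql
-- List the indexes that exist on one of the input tables.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'table_name-1';

-- Rename the key column so it no longer shares a name with the timestamp type.
-- "ztimestamp" is an example name, not a requirement.
ALTER TABLE "table_name-1" RENAME COLUMN "timestamp" TO ztimestamp;
```

After a rename, every query that refers to the old column name has to be updated as well.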

DROP SCHEMA IF EXISTS tmp CASCADE;
CREATE SCHEMA tmp ; 

set search_path='tmp'; 

SET random_page_cost=1; 

CREATE TABLE table_name1 
     (ztimestamp integer NOT NULL 
     , year integer NOT NULL 
     , month integer NOT NULL 
     , day integer NOT NULL 
     , period integer NOT NULL 
     , the_value real 
     , CONSTRAINT table_name1_pkey PRIMARY KEY (ztimestamp) 
     ) ; 

CREATE TABLE table_name2 
     (ztimestamp integer NOT NULL 
     , year integer NOT NULL 
     , month integer NOT NULL 
     , day integer NOT NULL 
     , period integer NOT NULL 
     , the_value real 
     , CONSTRAINT table_name2_pkey PRIMARY KEY (ztimestamp) 
     ) ; 


-- ... similar for 3,4,5,6 ...


INSERT INTO table_name1(ztimestamp,year,month,day,period,the_value) 
SELECT generate_series(1,2000), 0,0,0,0, 1.0; 
INSERT INTO table_name2 SELECT * FROM table_name1; 
INSERT INTO table_name3 SELECT * FROM table_name1; 
INSERT INTO table_name4 SELECT * FROM table_name1; 
INSERT INTO table_name5 SELECT * FROM table_name1; 
INSERT INTO table_name6 SELECT * FROM table_name1; 

EXPLAIN ANALYZE 
CREATE TABLE test_table AS 
SELECT c1.ztimestamp, c1.year, c1.month, c1.day, c1.period 
     , c1.the_value + c2.the_value + c3.the_value + c4.the_value 
     + c5.the_value + c6.the_value AS the_value 
FROM table_name1 AS c1 
     , table_name2 AS c2 
     , table_name3 AS c3 
     , table_name4 AS c4 
     , table_name5 AS c5 
     , table_name6 AS c6 
WHERE c1.ztimestamp=c2.ztimestamp 
AND c2.ztimestamp=c3.ztimestamp 
AND c3.ztimestamp=c4.ztimestamp 
AND c4.ztimestamp=c5.ztimestamp 
AND c5.ztimestamp=c6.ztimestamp 
    ;

Result and plan:

INSERT 0 2000 
INSERT 0 2000 
INSERT 0 2000 
INSERT 0 2000 
INSERT 0 2000 
INSERT 0 2000 
                                QUERY PLAN
--------------------------------------------------------------------------------
 Merge Join  (cost=0.00..475.93 rows=1963 width=44) (actual time=0.066..11.840 rows=2000 loops=1)
   Merge Cond: (c1.ztimestamp = c6.ztimestamp)
   ->  Merge Join  (cost=0.00..371.26 rows=1963 width=56) (actual time=0.052..8.706 rows=2000 loops=1)
         Merge Cond: (c1.ztimestamp = c5.ztimestamp)
         ->  Merge Join  (cost=0.00..291.12 rows=1963 width=48) (actual time=0.042..6.752 rows=2000 loops=1)
               Merge Cond: (c1.ztimestamp = c4.ztimestamp)
               ->  Merge Join  (cost=0.00..210.98 rows=1963 width=40) (actual time=0.033..4.751 rows=2000 loops=1)
                     Merge Cond: (c1.ztimestamp = c3.ztimestamp)
                     ->  Merge Join  (cost=0.00..130.84 rows=1963 width=32) (actual time=0.022..2.903 rows=2000 loops=1)
                           Merge Cond: (c1.ztimestamp = c2.ztimestamp)
                           ->  Index Scan using table_name1_pkey on table_name1 c1  (cost=0.00..50.70 rows=1963 width=24) (actual time=0.009..0.609 rows=2000 loops=1)
                           ->  Index Scan using table_name2_pkey on table_name2 c2  (cost=0.00..50.70 rows=1963 width=8) (actual time=0.010..0.756 rows=2000 loops=1)
                     ->  Index Scan using table_name3_pkey on table_name3 c3  (cost=0.00..50.70 rows=1963 width=8) (actual time=0.010..0.718 rows=2000 loops=1)
               ->  Index Scan using table_name4_pkey on table_name4 c4  (cost=0.00..50.70 rows=1963 width=8) (actual time=0.009..0.758 rows=2000 loops=1)
         ->  Index Scan using table_name5_pkey on table_name5 c5  (cost=0.00..50.70 rows=1963 width=8) (actual time=0.010..0.696 rows=2000 loops=1)
   ->  Index Scan using table_name6_pkey on table_name6 c6  (cost=0.00..50.70 rows=1963 width=8) (actual time=0.008..1.044 rows=2000 loops=1)
 Total runtime: 70.201 ms
(17 rows)
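The row estimates in the question's plan are far off (34844 estimated versus 3410 actual rows per scan, and 5592655588099248 estimated rows for the top join), and that plan uses sequential scans where the test tables here get index scans on their primary keys. One possible cause is that the planner has no current statistics for the original tables. This is an assumption and is not confirmed anywhere in this thread. A minimal sketch of how to check it, using the table names from the question:

```sql
-- Refresh the planner statistics for each input table.
ANALYZE "table_name-1";
ANALYZE "table_name-2";
ANALYZE "table_name-3";
ANALYZE "table_name-4";
ANALYZE "table_name-5";
ANALYZE "table_name-6";

-- Re-check the estimates on a reduced two-table version of the query.
EXPLAIN ANALYZE
SELECT count1."timestamp"
     , count1.the_value + count2.the_value AS the_value
FROM "table_name-1" AS count1
JOIN "table_name-2" AS count2 ON count1."timestamp" = count2."timestamp"
WHERE count1."timestamp" > 2012020000;
```

If the estimated row counts move close to the actual ones after this, stale statistics were at least part of the problem.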

UPDATE: most people prefer the JOIN syntax to the WHERE ... syntax:

EXPLAIN ANALYZE 
CREATE TABLE test_table AS 
SELECT c1.ztimestamp, c1.year, c1.month, c1.day, c1.period 
     , c1.the_value + c2.the_value + c3.the_value + c4.the_value 
     + c5.the_value + c6.the_value AS the_value 
FROM table_name1 AS c1 
JOIN table_name2 AS c2 ON c1.ztimestamp=c2.ztimestamp 
JOIN table_name3 AS c3 ON c2.ztimestamp=c3.ztimestamp 
JOIN table_name4 AS c4 ON c3.ztimestamp=c4.ztimestamp 
JOIN table_name5 AS c5 ON c4.ztimestamp=c5.ztimestamp 
JOIN table_name6 AS c6 ON c5.ztimestamp=c6.ztimestamp 
     ;
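Because the join column has the same name in every table, the same query could also be written with USING, which is standard PostgreSQL join syntax. This is a sketch of an equivalent form, not part of the original answer; the target name test_table_using is hypothetical and chosen only to avoid clashing with test_table:

```sql
EXPLAIN ANALYZE
CREATE TABLE test_table_using AS
-- With USING, the merged join column can be referenced without a table alias.
SELECT ztimestamp, c1.year, c1.month, c1.day, c1.period
     , c1.the_value + c2.the_value + c3.the_value + c4.the_value
     + c5.the_value + c6.the_value AS the_value
FROM table_name1 AS c1
JOIN table_name2 AS c2 USING (ztimestamp)
JOIN table_name3 AS c3 USING (ztimestamp)
JOIN table_name4 AS c4 USING (ztimestamp)
JOIN table_name5 AS c5 USING (ztimestamp)
JOIN table_name6 AS c6 USING (ztimestamp)
     ;
```

Note that EXPLAIN ANALYZE executes the statement, so the table is actually created.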

Source

2012-05-27 15:34:33 wildplasser
