Greenplum partition optimization

On Greenplum, I have a large table named fact_table that is partitioned by RANGE(day_bucket). Why is the following query so slow?

select max(day_bucket) from fact_table where day_bucket >= '2011-09-11 00:00:00' and day_bucket < '2011-12-14';

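I would expect this to look only at the head of each partition and return the result immediately, because every partition holds a single day_bucket value. Instead, Greenplum performs a full scan to compute the result. Can anyone explain why?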

Update:

Thank you for answering my question, but your tip did not help. Greenplum always does a full scan, even when I create the table with PARTITION BY LIST (day_bucket):

 
CREATE TABLE fact_table (
    id character varying(25) NOT NULL, 
    day_bucket timestamp without time zone NOT NULL, 
    -- ... (other columns, including the user_id distribution key, are not shown in the post)
) 
WITH (appendonly=true, orientation=column, compresstype=zlib, compresslevel=6) DISTRIBUTED BY (user_id) PARTITION BY LIST(day_bucket) 
      (
      PARTITION p20120101 VALUES ('2012-01-01 00:00:00'::timestamp without time zone) WITH (tablename='fact_table_1_prt_p20120101', appendonly=true, orientation=column, compresstype=zlib, compresslevel=6), 
      PARTITION p20120102 VALUES ('2012-01-02 00:00:00'::timestamp without time zone) WITH (tablename='fact_table_1_prt_p20120102', appendonly=true, orientation=column, compresstype=zlib, compresslevel=6), 
      PARTITION p20120103 VALUES ('2012-01-03 00:00:00'::timestamp without time zone) WITH (tablename='fact_table_1_prt_p20120103', appendonly=true, orientation=column, compresstype=zlib, compresslevel=6), 
      PARTITION p20120104 VALUES ('2012-01-04 00:00:00'::timestamp without time zone) WITH (tablename='fact_table_1_prt_p20120104', appendonly=true, orientation=column, compresstype=zlib, compresslevel=6), 
     .....

The EXPLAIN output shows that it always does a full scan:

->  Append-only Columnar Scan on mytestlist_1_prt_p20120102 mytestlist  (cost=0.00..34.95 rows=1 width=8)
      Filter: day_bucket >= '2012-01-02 00:00:00'::timestamp without time zone AND day_bucket
->  Append-only Columnar Scan on mytestlist_1_prt_p20120103 mytestlist  (cost=0.00..39.61 rows=1 width=8)
      Filter: day_bucket >= '2012-01-02 00:00:00'::timestamp without time zone AND day_bucket


2011-12-15 vim

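Are there any indexes on the table? How many rows does each partition hold on average? – 2011-12-15 18:48:24

In the future, please edit your question to add more information. Also, I merged the extra account you seem to have created by accident. – jjnguy

Which version of GPDB are you using? Can you post the full explain plan? –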

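You should pay attention to the constraints that apply to your partitions. To let the optimizer correctly exclude some partitions from the scan, you have to help it. In your case, you should use explicit type casts (GP cannot work out automatically at planning time that a string like 'YYYY-MM-DD' is actually a timestamp):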

select max(day_bucket) 
from fact_table 
where day_bucket >= '2011-09-11 00:00:00'::timestamp 
    and day_bucket < '2011-12-14'::timestamp
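
One way to check whether the casts make a difference is to compare the plan with the partition boundaries. The following is only a sketch: it reuses the table and column names from the question, and it assumes the pg_partitions catalog view is available in your Greenplum version (it exists in the 4.x to 6.x releases, as far as I know). If partition elimination works, the plan should list only the partitions whose boundaries overlap the requested range.

```sql
-- Sketch only: fact_table and day_bucket are the names used in the question.

-- 1) Show the plan for the query with explicit casts and check
--    which child partitions appear under the Append node.
EXPLAIN
SELECT max(day_bucket)
FROM fact_table
WHERE day_bucket >= '2011-09-11 00:00:00'::timestamp
  AND day_bucket <  '2011-12-14'::timestamp;

-- 2) List the boundaries of each partition so they can be compared
--    with the partitions named in the plan.
--    Assumption: the pg_partitions view exists in this GPDB version.
SELECT partitiontablename,
       partitiontype,
       partitionrangestart,
       partitionrangeend,
       partitionlistvalues
FROM pg_partitions
WHERE tablename = 'fact_table'
ORDER BY partitionposition;
```

For a RANGE-partitioned table the range columns are filled in, while for a LIST-partitioned table the values appear in partitionlistvalues.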


2011-12-16 09:03:07
