2013-12-14
2

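How to use a GIST or GIN index on an hstore column in PostgreSQL?

I am playing with hstore in PostgreSQL 9.3. I am trying to use and index an hstore column just as the documentation states. My problem is that the index does not seem to be used. Let me give an example.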

I created a table "Person":

=# CREATE TABLE Person (Id BIGSERIAL PRIMARY KEY NOT NULL, Values hstore); 

and inserted a test value:

=# INSERT INTO Person (Values) VALUES ('a=>1,b=>3');

Then, if I EXPLAIN a SELECT query that uses the @> operator on the "Values" column, I get, as expected:

=# EXPLAIN SELECT P.* FROM Person AS P WHERE P.Values @> hstore('a', '1'); 
         QUERY PLAN       
---------------------------------------------------------- 
Seq Scan on person p (cost=0.00..24.50 rows=1 width=40) 
    Filter: ("values" @> '"a"=>"1"'::hstore) 

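No index <-> sequential scan. Makes sense. However, it makes no difference whether I create a GIN or a GIST index: EXPLAIN keeps reporting a sequential scan: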

=# CREATE INDEX IX_GIN_VALUES ON Person USING GIN (values); 
CREATE INDEX 

=# EXPLAIN SELECT P.* FROM Person P WHERE P.values @> hstore('a', '1'); 

         QUERY PLAN       
---------------------------------------------------------- 
Seq Scan on person p (cost=0.00..1.01 rows=1 width=246) 
    Filter: ("values" @> '"age"=>"2"'::hstore) 

Maybe I am missing something obvious?

Answers

5

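If you are just playing with it, make sure to add enough data for an index scan to make sense. If there are only a few rows, or if many rows contain similar values (i.e., your WHERE condition is not selective enough), a seq scan will usually be faster than an index scan.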

Also, run ANALYZE on the table after you have loaded the test data.
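As a rough sketch of this advice, assuming the Person table and the hstore extension from the question, one could bulk-load rows with generate_series and refresh the planner statistics before looking at the plan again. The row count of 100000 is an arbitrary choice for illustration, and the column is double-quoted here only to sidestep any ambiguity with the VALUES keyword.

```sql
-- Assumes the Person table from the question and CREATE EXTENSION hstore.
-- Load enough rows that the table spans many pages (100000 is arbitrary).
INSERT INTO Person ("values")
SELECT hstore(ARRAY['a', n::text, 'b', n::text])
FROM generate_series(1, 100000) AS n;

-- Refresh the statistics the planner uses for its cost estimates.
ANALYZE Person;

-- Re-check the plan; the choice still depends on the planner's cost estimates.
EXPLAIN SELECT P.* FROM Person AS P WHERE P.Values @> hstore('a', '1');
```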


Some additional reading for @maxm:

(Performance has greatly improved since the latter was written.)

Why isn't his/her index being used?

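Because it is faster for Postgres to sequentially scan the whole table (which has a single row) and filter the rows out of a single disk page than it is to do an index lookup and then visit the table as well to retrieve the row's data.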
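One way to check that the index is usable at all, rather than merely not chosen, is to discourage sequential scans for the current session and look at the plan again. This is a diagnostic sketch only, assuming the Person table and the IX_GIN_VALUES index from the question. enable_seqscan is a standard planner setting, but it is not meant to be left off in normal operation.

```sql
-- Diagnostic only: make sequential scans look very expensive to the planner
-- for this session, so that it picks the index if the index can serve the query.
SET enable_seqscan = off;

EXPLAIN SELECT P.* FROM Person AS P WHERE P.Values @> hstore('a', '1');

-- Restore the default planner behaviour.
RESET enable_seqscan;
```

If the plan then shows a scan on the index, the index definition and the query are fine, and the earlier Seq Scan was simply the cheaper estimate for a tiny table.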

Is there a problem with how the asker created the index?

No, but see the links above on when it is better to use normalized data.

Prefer json or jsonb over hstore.
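For comparison, a jsonb column can be indexed and queried in much the same way as hstore. The sketch below assumes PostgreSQL 9.4 or later (jsonb is not available in 9.3, the version used in the question). The table, column, and index names are made up for illustration.

```sql
-- Hypothetical table; requires PostgreSQL 9.4+ for the jsonb type.
CREATE TABLE person_json (
    id   BIGSERIAL PRIMARY KEY,
    vals jsonb
);

INSERT INTO person_json (vals) VALUES ('{"a": "1", "b": "3"}');

-- GIN index using the default jsonb operator class, which supports @>.
CREATE INDEX ix_gin_person_json_vals ON person_json USING GIN (vals);

-- Containment query, analogous to the hstore @> query in the question.
EXPLAIN SELECT * FROM person_json WHERE vals @> '{"a": "1"}';
```

The same caveat applies: with only one row in the table, the planner will most likely still choose a sequential scan.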

With how the hstore column is queried? What needs to be changed for the SELECT query to use such an index?

Nothing, but again, see the links above on when it is better to use normalized data.

+0

@maxm: Is this enough information for you? –

+1

Why prefer json or jsonb over hstore? I will go read up on them, but what are your reasons? – maxm

+0

@maxm: It is more flexible, and the serialized data can be used as-is in JavaScript. –

1

In short: when the table occupies only a few pages, the Postgres planner prefers to skip the index and simply load and scan the rows.

CREATE SCHEMA stackoverflow20589058; 
--- CREATE SCHEMA 

SET search_path TO stackoverflow20589058,"$user",public; 
--- SET 

CREATE EXTENSION hstore; 
--- CREATE EXTENSION 

CREATE TABLE Person (Id BIGSERIAL PRIMARY KEY NOT NULL, Values hstore); 
--- CREATE TABLE 

WITH Vals(n) AS (SELECT * FROM generate_series(1,10)) 
INSERT INTO Person (
    SELECT n AS Id, hstore('a=>'||n||', b=>'||n) AS Values FROM Vals 
); 
--- INSERT 0 10 

EXPLAIN SELECT P.* FROM Person AS P WHERE P.Values @> hstore('a', '1'); 
---       QUERY PLAN       
--- ---------------------------------------------------------- 
--- Seq Scan on person p (cost=0.00..24.50 rows=1 width=40) 
--- Filter: ("values" @> '"a"=>"1"'::hstore) 
--- (2 rows) 

CREATE INDEX IX_GIN_VALUES ON Person USING GIN (values); 
--- CREATE INDEX 

------------------------- When there are few values, a sequential scan is 
------------------------- often the best search strategy. Grabbing a few 
------------------------- pages in sequence can be cheaper than making an 
------------------------- extra disk seek to load the index. 
EXPLAIN SELECT P.* FROM Person AS P WHERE P.Values @> hstore('a', '1'); 
---      QUERY PLAN       
--- --------------------------------------------------------- 
--- Seq Scan on person p (cost=0.00..1.12 rows=1 width=40) 
--- Filter: ("values" @> '"a"=>"1"'::hstore) 
--- (2 rows) 

TRUNCATE Person; 
--- TRUNCATE TABLE 

WITH Vals(n) AS (SELECT * FROM generate_series(1,100000)) 
INSERT INTO Person (
    SELECT n AS Id, hstore('a=>'||n||', b=>'||n) AS Values FROM Vals 
); 
--- INSERT 0 100000 

------------------------- When there are many rows, using the index can 
------------------------- allow us to skip quite a lot of I/O; so 
------------------------- Postgres's planner makes use of the index. 
EXPLAIN SELECT P.* FROM Person AS P WHERE P.Values @> hstore('a', '1'); 
---         QUERY PLAN         
--- -------------------------------------------------------------------------------- 
--- Bitmap Heap Scan on person p (cost=916.83..1224.56 rows=107 width=40) 
--- Recheck Cond: ("values" @> '"a"=>"1"'::hstore) 
--- -> Bitmap Index Scan on ix_gin_values (cost=0.00..916.80 rows=107 width=0) 
---   Index Cond: ("values" @> '"a"=>"1"'::hstore) 
--- (4 rows) 

DROP SCHEMA stackoverflow20589058 CASCADE; 
--- NOTICE: drop cascades to 2 other objects 
--- DETAIL: drop cascades to extension hstore 
--- drop cascades to table person 
--- DROP SCHEMA
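To see how many pages the planner believes the table and the index occupy, one can look at the estimates stored in the pg_class catalog. This is a small sketch, meant to be run while the person table still exists (that is, before the DROP SCHEMA step). relpages and reltuples are only estimates, updated by VACUUM, ANALYZE, and a few DDL commands, so an ANALYZE is run first.

```sql
-- Refresh the estimates stored in pg_class.
ANALYZE Person;

-- relpages: estimated size on disk in pages; reltuples: estimated row count.
-- Note: relname is not schema-qualified here, so same-named relations in
-- other schemas would also show up.
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname IN ('person', 'ix_gin_values');
```

Comparing these numbers after the 10-row and the 100000-row loads gives a feel for why the plan switches from a Seq Scan to a Bitmap Heap Scan.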