Query execution time on Couchbase is too long

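I am new to Couchbase and I am running some queries with N1QL, but they take a very long time (9 minutes). My data has 200,000 documents, and the documents contain nested types: 6,000,000 nested elements are distributed across the 200,000 documents, so the UNNEST operation matters. A sample of my data is: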

{"p_partkey": 2, "lineorder": [{"customer": [{"c_city": "INDONESIA1"}], "lo_supplycost": 54120, "orderdate": [{"d_weeknuminyear": 19}], "supplier": [{"s_phone": "16-789-973-6601|"}], "commitdate": [{"d_year": 1993}], "lo_tax": 7}, {"customer": [{...

One query is:

SELECT SUM(l.lo_extendedprice*l.lo_discount*0.01) as revenue 
from part p UNNEST p.lineorder l UNNEST l.orderdate o 
where o.d_year=1993 and l.lo_discount between 1 and 3 and l.lo_quantity<25;

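The data has the fields mentioned above, but the query takes 9 minutes to execute. I am running it only on my own computer, so there is a single node. My computer has 16 GB of RAM, the cluster RAM quota is 3.2 GB, and there is only one bucket, of 3 GB. The total size of my data is 2.45 GB. I used the calculation described at http://docs.couchbase.com/admin/admin/Concepts/bp-sizingGuidelines.html to size my cluster and bucket. Am I doing something wrong, or is this time normal for this amount of data?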

Now I have created indexes such as:

CREATE INDEX idx_discount ON part(DISTINCT ARRAY l.lo_discount FOR l IN lineorder END); 

CREATE INDEX idx_quantity ON part(DISTINCT ARRAY l.lo_quantity FOR l IN lineorder END); 

CREATE INDEX idx_year ON part(DISTINCT ARRAY o.d_year FOR o IN (DISTINCT ARRAY l.orderdate FOR l IN lineorder END) END);

But the database does not use them.

An example query is:

SELECT SUM(l.lo_extendedprice*l.lo_discount*0.01) as revenue 
from part p UNNEST p.lineorder l UNNEST l.orderdate o 
where o.d_year=1993 and l.lo_discount between 1 and 3 and l.lo_quantity<25;

As another example, I created this index:

CREATE INDEX teste3 ON `part` (DISTINCT ARRAY l.lo_quantity FOR l IN lineorder END);

and ran this query:

select l.lo_quantity from part as p UNNEST p.lineorder l where l.lo_quantity>20 limit 3

Because I had dropped the primary index, it does not execute. It returns the error: "No primary index on keyspace part. Use CREATE PRIMARY INDEX to create one."

2016-05-11 Raphael

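You can use Couchbase 4.5 (GA coming soon) with array indexing. Array indexes can be used with UNNEST. They let you index individual elements of an array, including arrays nested inside other arrays.

You can create the following indexes and then use EXPLAIN to make sure the plan contains an IndexScan on the index you expect.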

CREATE INDEX idx_discount ON part(DISTINCT ARRAY l.lo_discount FOR l IN lineorder END); 

CREATE INDEX idx_quantity ON part(DISTINCT ARRAY l.lo_quantity FOR l IN lineorder END); 

CREATE INDEX idx_year ON part(DISTINCT ARRAY (DISTINCT ARRAY o.d_year FOR o IN l.orderdate END) FOR l IN lineorder END);
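A minimal sketch of that check, using the query from the question: prefixing the statement with EXPLAIN returns the query plan instead of running the query. The expectation, not verified here, is that the plan contains an IndexScan operator naming one of the indexes above rather than a PrimaryScan.

```sql
/* Show the plan without executing the query. */
EXPLAIN
SELECT SUM(l.lo_extendedprice * l.lo_discount * 0.01) AS revenue
FROM part p
UNNEST p.lineorder l
UNNEST l.orderdate o
WHERE o.d_year = 1993
  AND l.lo_discount BETWEEN 1 AND 3
  AND l.lo_quantity < 25;
```

If the plan shows only a PrimaryScan, the array indexes are not being picked up for this query.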

2016-05-11 03:49:45 geraldss

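Hi @geraldss, I am already using 4.5. My intention is to use indexes, because I will be running different queries. Could you tell me whether I have configured my Couchbase correctly and, if indexes are not used, whether there is another way to get better performance? Thanks for your help. – Raphael

Hi @Raphael, even if you have many queries, you need to use indexes. Couchbase lets you create multiple indexes. – geraldss

OK @geraldss, thank you very much. – Raphael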

After reading the blog post at http://blog.couchbase.com/2016/may/1.making-most-of-your-arrays..-with-covering-array-indexes-and-more I discovered the problem:

If you create an index like this:

CREATE INDEX iflight_day 
     ON `travel-sample` (DISTINCT ARRAY v.flight FOR v IN schedule END);

you must use the same variable name in the query, in this case the letter 'v':

SELECT v.day from `travel-sample` as t UNNEST t.schedule v where v.flight="LY104";
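To make the naming rule concrete, here is a sketch of how the two cases could be compared with EXPLAIN. The alias s in the second statement is only an illustrative rename. Based on the behaviour described in this thread for Couchbase 4.5, the first form is the one expected to show an IndexScan on iflight_day; other versions may behave differently.

```sql
/* UNNEST alias v matches the variable in the index definition. */
EXPLAIN
SELECT v.day
FROM `travel-sample` AS t
UNNEST t.schedule AS v
WHERE v.flight = "LY104";

/* Same query with the alias renamed to s (illustrative);
   as reported in this thread, this form did not pick up iflight_day on 4.5. */
EXPLAIN
SELECT s.day
FROM `travel-sample` AS t
UNNEST t.schedule AS s
WHERE s.flight = "LY104";
```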

The same applies to deeper nesting levels:

CREATE INDEX inested ON `travel-sample` 
(DISTINCT ARRAY (DISTINCT ARRAY y.flight FOR y IN x.special_flights END) FOR x IN schedule END);

In this case you must use 'y' and 'x':

SELECT x.day from `travel-sample` as t UNNEST t.schedule x UNNEST x.special_flights y where y.flight="AI444";

Now everything works fine.

But another problem appeared when I ran a query like this:

SELECT * from `travel-sample` as t UNNEST t.schedule x UNNEST x.special_flights y 
where x.day=7 and y.flight="AI444";

together with an index on day alone, created in the same way as above:

CREATE INDEX day 
      ON `travel-sample` (DISTINCT ARRAY y.day FOR y IN schedule END);

It uses only one index, sometimes 'day' and sometimes 'inested'.

2016-08-12 22:09:52 Raphael

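The variable must match between the UNNEST and the array index. Try this: SELECT s.day FROM `travel-sample` AS t UNNEST t.schedule AS v WHERE v.flight = "LY146"; – geraldss

@geraldss, I just installed the latest Enterprise Edition. My query is just like yours, you only changed every v to s, but it does not use the index iflight_day. It uses it only when I query like this: SELECT s.day from (SELECT schedule FROM `travel-sample` WHERE ANY v IN schedule SATISFIES v.flight = "LY146" END) as t UNNEST t.schedule s where s.flight = "LY146"; – Raphael

Not sure why that is happening. Please try the upcoming 4.5.1. They all work for me. – geraldss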
