Which column family approach is preferable in Cassandra?

I want to create a Cassandra column family with the fields (time_partition, key, period_time and period_value) to store time series data.
I use time_partition for fast querying: time_partition = period_time / (EPOCH of WEEK);

Which column family is better to create? (I also have a large amount of data.)

Method 1

CREATE TABLE tablename 
(
    time_partition text, 
    key text, 
    period_time text, 
    period_value text, 
    PRIMARY KEY (time_partition,key, period_time) 
);

Method 2

CREATE TABLE tablename 
(
    key text, 
    time_partition text, 
    period_time text, 
    period_value text, 
    PRIMARY KEY (key,time_partition, period_time) 
);

The only difference between the two methods is the order of the columns in the primary key.

Source

2014-03-04 kjk

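Yes, there is a difference. Your primary key defines how your data is physically stored and how you can query it.

The partition key (the first item in the primary key) determines which node the data is stored on.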
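One way to see this placement is the CQL token() function, which returns the partitioner's hash of the partition key. The query below is only a sketch and assumes the Method 1 table, where time_partition is the partition key; the LIMIT is arbitrary.

```cql
-- Assumes the Method 1 table: PRIMARY KEY (time_partition, key, period_time).
-- token() returns the partitioner's hash of the partition key. Rows that share
-- a token belong to the same partition and are stored on the same replicas.
SELECT token(time_partition), time_partition, key, period_time
FROM tablename
LIMIT 10;
```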

The following queries are valid in the first case (for the non-partition key columns, relations other than = can also be used).

select * from tablename where time_partition = <val>; 
select * from tablename where time_partition = <val> and key = <val>; 
select * from tablename where time_partition = <val> and key = <val> and period_time = <val>;
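As a sketch of the "other relations" case: with the Method 1 key, a range restriction can be placed on a clustering column once every key column before it is fixed with =. The literal values are made up for illustration. Because period_time is declared as text, the comparison is lexical, so it only follows chronological order if the stored strings sort that way (for example zero-padded numbers or ISO 8601 timestamps).

```cql
-- Method 1 key: PRIMARY KEY (time_partition, key, period_time)
-- Range on the last clustering column, equality on everything before it.
SELECT * FROM tablename
WHERE time_partition = '2305'
  AND key = 'sensor-1'
  AND period_time >= '2014-03-03T00:00:00'
  AND period_time <  '2014-03-04T00:00:00';

-- Range directly on the first clustering column.
SELECT * FROM tablename
WHERE time_partition = '2305'
  AND key > 'sensor-1';
```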

For the second case, the valid queries are:

select * from tablename where key = <val>; 
select * from tablename where key = <val> and time_partition = <val>; 
select * from tablename where key = <val> and time_partition = <val> and period_time = <val>;

You cannot run select * from tablename where key = <val>; against the first schema, or select * from tablename where time_partition = <val>; against the second.

Model your tables according to your queries.

Source

2014-03-04 18:01:18

I only want to query like this: (for the first method) select * from tablename where time_partition = and key = and period_time = ; or (for the second method) select * from tablename where key = and time_partition = and period_time = ; Are both equally fast at fetching the data? – kjk

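Partition size also affects read performance. So if the number of distinct "key" values is small, you will end up with large partitions. You can use a composite partition key, "PRIMARY KEY ((key, time_partition), period_time)", so that each physical partition stores only the "period_time" values for one (key, time_partition) pair. But then you cannot order by "key" and "time_partition". –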
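The composite partition key suggested in this comment could be sketched as follows. The table name tablename_by_key_and_partition and the literal values are made up for illustration; the columns are the ones from the question.

```cql
-- Hypothetical table name; same columns as in the question.
-- (key, time_partition) together form the partition key,
-- period_time is the only clustering column.
CREATE TABLE tablename_by_key_and_partition
(
    key text,
    time_partition text,
    period_time text,
    period_value text,
    PRIMARY KEY ((key, time_partition), period_time)
);

-- Both partition key columns must be restricted with equality;
-- period_time may still take a range.
SELECT * FROM tablename_by_key_and_partition
WHERE key = 'sensor-1'
  AND time_partition = '2305'
  AND period_time >= '2014-03-03T00:00:00';
```

With this layout a query that supplies only key, or only time_partition, no longer identifies a partition, so it suits the case where both values are always known at read time.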

There is absolutely no difference. The order of the key columns does not affect query speed.

Source

2014-03-04 15:09:20

What if the number of "key" values is small and the "time interval" is large? – kjk

It all depends on what kind of queries you are running. If you query by the full composite key, the order does not matter at all. –
