What is the sort order of Cassandra UTF8Type keys? (Cassandra 2.0)


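All of the documentation led me to expect a lexical sort order (basically alphabetical). That does not appear to be the order Cassandra uses, and I am having a hard time guessing what order it does use.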

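I built a table to count interactions with something called an "application", organized by time period of the day. (This is a simplified example to demonstrate the cause of my confusion.) I would like to be able to look up a specific app. The CQL description of the table is as follows: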

 
CREATE TABLE "appMetrics" (app text,time timestamp,counter_val counter, 
    PRIMARY KEY (app, time)) WITH COMPACT STORAGE;

I loaded my data:

 
update "appMetrics" set counter_val = counter_val+1 WHERE app='ab' AND time='2014-02-14 00:00:00'; 
update "appMetrics" set counter_val = counter_val+1 WHERE app='a' AND time='2014-02-14 00:00:00'; 
update "appMetrics" set counter_val = counter_val+1 WHERE app='c' AND time='2014-02-14 00:00:00'; 
update "appMetrics" set counter_val = counter_val+1 WHERE app='b' AND time='2014-02-14 00:00:00'; 
update "appMetrics" set counter_val = counter_val+1 WHERE app='bc' AND time='2014-02-14 00:00:00'; 
update "appMetrics" set counter_val = counter_val+1 WHERE app='ca' AND time='2014-02-14 00:00:00';

I select from the table and see this result:

 
    select * from "appMetrics";

     app | time                     | counter_val
    -----+--------------------------+-------------
       a | 2014-02-14 00:00:00-0500 |           1
       c | 2014-02-14 00:00:00-0500 |           1
      ab | 2014-02-14 00:00:00-0500 |           1
      ca | 2014-02-14 00:00:00-0500 |           1
      bc | 2014-02-14 00:00:00-0500 |           1
       b | 2014-02-14 00:00:00-0500 |           1

    (6 rows)

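So the order is not alphabetical, is not the insertion order, and is not any other order that I can see. It is not random either, or at least it is repeatable: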

cqlsh:simplex> select * from "appMetrics" where token(app) >= token('ab');

 app | time                     | counter_val
-----+--------------------------+-------------
  ab | 2014-02-14 00:00:00-0500 |           1
  ca | 2014-02-14 00:00:00-0500 |           1
  bc | 2014-02-14 00:00:00-0500 |           1
   b | 2014-02-14 00:00:00-0500 |           1

(4 rows)

cqlsh:simplex> select * from "appMetrics" where token(app) <= token('ab');

 app | time                     | counter_val
-----+--------------------------+-------------
   a | 2014-02-14 00:00:00-0500 |           1
   c | 2014-02-14 00:00:00-0500 |           1
  ab | 2014-02-14 00:00:00-0500 |           1

(3 rows)

For what it's worth, the column family is described as:

 
    ColumnFamily: appMetrics 
     Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type 
     Default column value validator: org.apache.cassandra.db.marshal.CounterColumnType 
     Cells sorted by: org.apache.cassandra.db.marshal.TimestampType 
     GC grace seconds: 864000 
     Compaction min/max thresholds: 4/32 
     Read repair chance: 0.1 
     DC Local Read repair chance: 0.0 
     Populate IO Cache on flush: false 
     Replicate on write: true 
     Caching: KEYS_ONLY 
     Default time to live: 0 
     Bloom Filter FP chance: 0.01 
     Index interval: 128 
     Speculative Retry: 99.0PERCENTILE 
     Built indexes: [] 
     Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy 
     Compression Options: 
     sstable_compression: org.apache.cassandra.io.compress.LZ4Compressor

Can someone explain how these are sorted?

Source

2014-02-13 BeauGust

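OK, I think I now know the answer to this question. Because the partition key is stored as a tokenized (hashed) representation of the key, the rows (partitions) are stored and returned in token order, not in key order.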

As an example, for the same table shown above, I requested the token value of each key and got the following:

 
cqlsh:simplex> select token(app), app from "appMetrics";

 token(app)           | app
----------------------+-----
 -8839064797231613815 |   a
 -8198557465434950441 |   c
 -7815133031266706642 |  ab
  -633243080167210587 |  ca
  4832945267908438539 |  bc
  8833996863197925870 |   b

(6 rows)
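To check that token order really explains the results in the question, here is a minimal Python sketch. It does not talk to Cassandra or compute Murmur3 hashes; it only takes the token values printed by cqlsh for this table and sorts on them. The `scan` helper and its `low`/`high` parameters are hypothetical names used for illustration, standing in for the `token(app) >= ...` and `token(app) <= ...` filters.

```python
# (token, app) pairs copied from the cqlsh output, listed here in the
# order the rows were inserted rather than in token order.
rows = [
    (-7815133031266706642, "ab"),
    (-8839064797231613815, "a"),
    (-8198557465434950441, "c"),
    (8833996863197925870, "b"),
    (4832945267908438539, "bc"),
    (-633243080167210587, "ca"),
]
token_of = {app: token for token, app in rows}


def scan(low=None, high=None):
    """Return apps in ascending token order, limited to low <= token <= high."""
    selected = [
        (token, app)
        for token, app in rows
        if (low is None or token >= low) and (high is None or token <= high)
    ]
    return [app for _, app in sorted(selected)]


full_scan = scan()                               # select * from "appMetrics"
at_or_after_ab = scan(low=token_of["ab"])        # token(app) >= token('ab')
at_or_before_ab = scan(high=token_of["ab"])      # token(app) <= token('ab')

print(full_scan)        # ['a', 'c', 'ab', 'ca', 'bc', 'b']
print(at_or_after_ab)   # ['ab', 'ca', 'bc', 'b']
print(at_or_before_ab)  # ['a', 'c', 'ab']
```

All three lists match the row order cqlsh returned in the question, which is consistent with partitions being read back in token order.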

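More info: this happens because I am using the default Murmur3Partitioner. I could get the rows in alphabetical order (I think) by using the ByteOrderedPartitioner. Unfortunately, the partitioner is set at the cluster level, so it governs the whole cluster. DataStax recommends against using the ByteOrderedPartitioner (http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architecturePlanningAntiPatterns_c.html).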
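For comparison, this is roughly the order a byte-ordered partitioner would be expected to give for the same keys. The sketch below is only a model of the comparison rule (keys compared byte by byte on their UTF-8 encoding); it is not run against a cluster, and it assumes the six keys from the question.

```python
# The six partition keys from the question, in insertion order.
apps = ["ab", "a", "c", "b", "bc", "ca"]

# Python compares bytes objects lexicographically by unsigned byte value,
# which is used here as a stand-in for ordering partitions by raw key bytes.
byte_order = sorted(apps, key=lambda app: app.encode("utf-8"))

print(byte_order)  # ['a', 'ab', 'b', 'bc', 'c', 'ca']
```

For plain ASCII keys like these, byte order and alphabetical order coincide. That is the ordering the question expected, at the cost of the uneven data distribution that makes the byte-ordered partitioner an anti-pattern.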

Source

2014-02-14 16:52:05 BeauGust
