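Cassandra nodes fail under heavy writes

I am new to Cassandra and am testing it under write load, but I am running into stability problems. First, a little information about the environment: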
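- Windows (tested on PCs running Windows 7 and 8, and on Server 2008 R2 and Server 2012)
- Java 7u45 (the latest available at the time of writing)
- Cassandra 1.2.10
- Cassandra is accessed through the Cassandra C# driver 1.01
- The problem occurs regardless of cluster size (tested with 1 node up to 6 nodes in a cluster).
- The data disks are SSDs.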
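The application I am writing will process extremely large data sets and needs the high write (and read) throughput that Cassandra is known for.

Example

The example I am using creates a "Test" object with four fields: ID (GUID), Name (text), Insert_User (text), and Insert_TimeStamp (timestamp). The code simply tries to create 1 million records in batches of 50,000. Typically, the write process fails after somewhere between 150,000 and 200,000 records. Most of the time a write timeout exception occurs (the timeout is set to 20 seconds). Occasionally the heap overflows, but I seem to have resolved that by adjusting the cache and flush settings in cassandra.yaml and by enlarging the heap with the following Java configuration settings: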
-Xmx20G -Xms1G -Xss256K
The code I am currently using is here:
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Diagnostics;
using System.Linq;
using System.Linq.Expressions;
using System.Text;
using System.Threading.Tasks;
using Cassandra;
using Cassandra.Data.Linq;

namespace TestLinq3
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WindowWidth = 160;
            Stopwatch sw = new Stopwatch();
            sw.Start();
            Console.WriteLine("Started at " + System.DateTime.Now);

            var cluster = Cluster.Builder()
                .AddContactPoint("127.0.0.1")
                .WithCredentials("cassandra", "cassandra")
                .Build();
            Metadata metadata = cluster.Metadata;
            Console.WriteLine("Starting process...");

            var session = cluster.Connect();
            session.CreateKeyspaceIfNotExists("test");
            session.ChangeKeyspace("test");
            Console.WriteLine("Connected to keyspace...");

            var table = session.GetTable<Test>();
            table.CreateIfNotExists();

            List<Test> testlist = new List<Test>();
            for (int j = 0; j < 20; j++)
            {
                Console.WriteLine("Running j loop " + j.ToString());
                var batch = session.CreateBatch();
                for (int i = 0; i < 50000; i++)
                {
                    testlist.Add(new Test
                    {
                        id = System.Guid.NewGuid(),
                        name = "Name " + i,
                        insertUser = "cassandra",
                        insertTimeStamp = System.DateTimeOffset.UtcNow
                    });
                }
                batch.Append(from t in testlist select table.Insert(t));
                try
                {
                    batch.Execute();
                    //Flush();
                }
                catch (WriteTimeoutException ex)
                {
                    Console.WriteLine("WriteTimeoutException hit. Waiting 20 seconds...");
                    Console.WriteLine(ex.StackTrace);
                    System.Threading.Thread.Sleep(60000);
                }
                batch = null;
                Console.WriteLine("Time elapsed since start is "
                    + sw.Elapsed.Hours.ToString("00") + ":"
                    + sw.Elapsed.Minutes.ToString("00") + ":"
                    + sw.Elapsed.Seconds.ToString("00"));
            }

            var results = (from rows in table where rows.name == "Name 333" select rows).Execute().Count();
            Console.WriteLine(results);
            sw.Stop();
            Console.WriteLine("Processing time was "
                + sw.Elapsed.Hours.ToString("00") + ":"
                + sw.Elapsed.Minutes.ToString("00") + ":"
                + sw.Elapsed.Seconds.ToString("00") + ":"
                + sw.Elapsed.Milliseconds.ToString("00") + ".");
            Console.ReadLine();
        }

        [AllowFiltering]
        [Table("test")]
        public class Test
        {
            [PartitionKey]
            [Column("id")]
            public Guid id;

            [SecondaryIndex]
            [Column("name")]
            public string name;

            [SecondaryIndex]
            [Column("insert_user")]
            public string insertUser;

            [SecondaryIndex]
            [Column("insert_timestamp")]
            public DateTimeOffset insertTimeStamp;
        }
    }
}
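A side note on the loop in the listing: testlist is declared outside the j loop and is never cleared, so each batch appears to be built from every row added so far rather than from 50,000 new rows. The sketch below uses no Cassandra driver at all; it only tallies the row counts that loop would produce, and shows one way to cut a list into smaller chunks. The names BatchSizing, AccumulatingBatchSizes and SplitIntoChunks are hypothetical, and the chunk size of 100 is an arbitrary illustrative value, not a tested recommendation for this cluster.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class for illustration only; not part of any driver.
public static class BatchSizing
{
    // Rows carried by each batch when the source list is never cleared:
    // iteration j (0-based) sends rowsPerIteration * (j + 1) rows.
    public static List<long> AccumulatingBatchSizes(int iterations, int rowsPerIteration)
    {
        var sizes = new List<long>();
        long listCount = 0;
        for (int j = 0; j < iterations; j++)
        {
            listCount += rowsPerIteration; // mirrors the inner testlist.Add(...) loop
            sizes.Add(listCount);          // the batch is built from the whole list
        }
        return sizes;
    }

    // Splits a sequence into consecutive chunks of at most chunkSize items.
    public static IEnumerable<List<T>> SplitIntoChunks<T>(IEnumerable<T> source, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        var chunk = new List<T>(chunkSize);
        foreach (var item in source)
        {
            chunk.Add(item);
            if (chunk.Count == chunkSize)
            {
                yield return chunk;
                chunk = new List<T>(chunkSize);
            }
        }
        if (chunk.Count > 0) yield return chunk; // trailing partial chunk
    }

    public static void Main()
    {
        var sizes = AccumulatingBatchSizes(20, 50000);
        Console.WriteLine("Third batch: " + sizes[2] + " rows");
        Console.WriteLine("Last batch: " + sizes[19] + " rows");
        Console.WriteLine("Total rows sent: " + sizes.Sum());

        // Assumed chunk size of 100, chosen only to illustrate the split.
        var chunks = SplitIntoChunks(Enumerable.Range(0, 50000), 100).ToList();
        Console.WriteLine("Chunks of 100 from 50,000 rows: " + chunks.Count);
    }
}
```

If the tally reflects what the driver actually sends, the third batch alone would carry 150,000 rows, which may be relevant to the range where the timeouts begin. Clearing the list after each Execute(), or sending much smaller chunks, would be one direction to try.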
The cassandra.yaml settings I have are here (comments removed to save space):
#Cassandra storage config YAML
cluster_name: 'DEV'
initial_token:
max_hint_window_in_ms: 10800000 # 3 hours
hinted_handoff_throttle_in_kb: 1024
max_hints_delivery_threads: 2
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer
permissions_validity_in_ms: 2000
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
data_file_directories:
    - /var/lib/cassandra/data
commitlog_directory: /var/lib/cassandra/commitlog
disk_failure_policy: stop
key_cache_size_in_mb:
key_cache_save_period: 14400
row_cache_size_in_mb: 0
row_cache_save_period: 0
row_cache_provider: SerializingCacheProvider
saved_caches_directory: /var/lib/cassandra/saved_caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    - seeds: "127.0.0.1"
flush_largest_memtables_at: 0.50
reduce_cache_sizes_at: 0.50
reduce_cache_capacity_to: 0.30
concurrent_reads: 32
concurrent_writes: 32
memtable_total_space_in_mb: 4096
memtable_flush_writers: 8
memtable_flush_queue_size: 4
trickle_fsync: false
trickle_fsync_interval_in_kb: 10240
storage_port: 7000
ssl_storage_port: 7001
listen_address: localhost
start_native_transport: true
native_transport_port: 9042
start_rpc: true
rpc_address: localhost
rpc_port: 9160
rpc_keepalive: true
rpc_server_type: sync
thrift_framed_transport_size_in_mb: 15
incremental_backups: false
snapshot_before_compaction: false
auto_snapshot: true
column_index_size_in_kb: 64
in_memory_compaction_limit_in_mb: 64
multithreaded_compaction: false
compaction_throughput_mb_per_sec: 16
compaction_preheat_key_cache: true
read_request_timeout_in_ms: 10000
range_request_timeout_in_ms: 10000
write_request_timeout_in_ms: 10000
truncate_request_timeout_in_ms: 60000
request_timeout_in_ms: 10000
cross_node_timeout: false
endpoint_snitch: SimpleSnitch
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 600000
dynamic_snitch_badness_threshold: 0.1
request_scheduler: org.apache.cassandra.scheduler.NoScheduler
index_interval: 128
server_encryption_options:
    internode_encryption: none
    keystore: conf/.keystore
    keystore_password: cassandra
    truststore: conf/.truststore
    truststore_password: cassandra
client_encryption_options:
    enabled: false
    keystore: conf/.keystore
    keystore_password: cassandra
internode_compression: all
inter_dc_tcp_nodelay: true
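The goal of this effort is for the Cassandra nodes to remain stable, even if that means slower writes per node (the current write speed is roughly 7,000 records per second). After searching StackOverflow and other places on the internet, I have not found a proper solution to this problem, and I would appreciate any feedback from people with experience running Cassandra in high-write-volume environments.

Best,
Tom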
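I have tried a number of different adjustments to the memtable and cache sizes. Still no luck. The problem is that the application I am writing needs to perform a large volume of writes every night.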