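We have a 6-node Cassandra cluster with 3 seeds. One day AWS sent us a notice that one of our instances was scheduled for retirement, and it was seed01. To resolve this we simply needed to stop/start the instance so that it would move to a new AWS host. Before the stop/start we did the following:
2) stopped gossip
3) stopped thrift
4) ran drain
5) stopped Cassandra
6) moved all the data to EBS (we use ephemeral volumes for data)
7) stopped/started the instance
8) moved the data back
9) started Cassandra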
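For readers who want to reproduce the procedure, the steps above can be sketched roughly as follows. This is only an illustration: the `nodetool` subcommands are the standard ones, but the data directory, the EBS mount point, the `service cassandra` init script name and the use of `rsync` are assumptions and not taken from our setup. The instance stop/start itself (step 7) happens on the AWS side and is not part of the script.

```python
import shlex
import subprocess

# Hypothetical paths: the post does not give the real data or EBS locations.
DATA_DIR = "/var/lib/cassandra"
EBS_BACKUP_DIR = "/mnt/ebs/cassandra-backup"


def pre_stop_commands(data_dir=DATA_DIR, backup_dir=EBS_BACKUP_DIR):
    """Commands for steps 2-6: quiesce the node, then copy data off ephemeral storage."""
    return [
        "nodetool disablegossip",       # 2) stop gossip
        "nodetool disablethrift",       # 3) stop thrift
        "nodetool drain",               # 4) flush memtables, stop accepting writes
        "sudo service cassandra stop",  # 5) assumed init script name
        f"sudo rsync -a {data_dir}/ {backup_dir}/",  # 6) copy data to EBS
    ]


def post_start_commands(data_dir=DATA_DIR, backup_dir=EBS_BACKUP_DIR):
    """Commands for steps 8-9, run after the instance is back on a new host."""
    return [
        f"sudo rsync -a {backup_dir}/ {data_dir}/",  # 8) copy data back
        "sudo service cassandra start",              # 9) start Cassandra
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(cmd)
        if not dry_run:
            subprocess.run(shlex.split(cmd), check=True)
```

Calling `run(pre_stop_commands())` only prints the commands, which makes it easy to review the order before running anything on a live node.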
Cassandra 1.2: new node does not want to join the ring
But after starting Cassandra on seed01, nodetool status showed:
Datacenter: UNKNOWN-DC
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
DN 10.149.45.115 ? 256 17.3% ae4166fb-76e1-4900-947c-7e87ca262ea0 UNKNOWN-RACK
DN 10.164.84.171 ? 256 17.5% 638dae19-a6f5-4330-9466-f46ddb3b9d79 UNKNOWN-RACK
DN 10.149.44.215 ? 256 16.2% 987914af-f057-4922-8ee1-2a999108c75d UNKNOWN-RACK
DN 10.232.20.72 ? 256 14.8% fb5dfd50-de9e-42ed-b539-bd937a045992 UNKNOWN-RACK
DN 10.166.37.188 ? 256 17.1% f149c294-ca1d-427c-b510-2f91a0966b5a UNKNOWN-RACK
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.232.17.19 1020.87 MB 256 17.1% 08055af6-5dfa-4d4e-aa72-cf1d2952e23e 1b
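To compare this view across nodes without reading every row, a small parser can summarize `nodetool status` output per datacenter. This is a minimal sketch that assumes the column layout shown above; `summarize_status` is a hypothetical helper name, not part of Cassandra.

```python
from collections import defaultdict


def summarize_status(text):
    """Return {datacenter: {"UN": n, "DN": m, ...}} from `nodetool status` output."""
    summary = defaultdict(lambda: defaultdict(int))
    datacenter = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Datacenter:"):
            datacenter = line.split(":", 1)[1].strip()
            continue
        fields = line.split()
        if not datacenter or not fields:
            continue
        code = fields[0]
        # Node rows start with a two-letter code:
        # U/D (status) followed by N/L/J/M (state).
        if len(code) == 2 and code[0] in "UD" and code[1] in "NLJM":
            summary[datacenter][code] += 1
    return {dc: dict(counts) for dc, counts in summary.items()}
```

Fed the output above, this yields `{"UNKNOWN-DC": {"DN": 5}, "us-east": {"UN": 1}}`: seed01 sees only itself as up in us-east and places the other five nodes, all down, in UNKNOWN-DC.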
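We also tried starting seed04 with seed02 and seed03 configured as its seeds, but it created a new ring instead of joining the existing one.
We checked port 7000 on all nodes, and it is reachable from every node. By default we open all ports (TCP/UDP 0-65535) within the security group that all the nodes belong to. In tcpdump I can see the new node trying to connect to a seed: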
08:43:42.056115 IP 10.235.62.198.45163 > 10.164.84.171.7000: Flags [P.], seq 0:8, ack 1, win 46, options [nop,nop,TS val 81748069 ecr 538805526], length 8
08:43:42.056146 IP 10.164.84.171.7000 > 10.235.62.198.45163: Flags [R], seq 110766787, win 0, length 0
08:43:42.157893 IP 10.235.62.198.45165 > 10.164.84.171.7000: Flags [S], seq 452519826, win 5840, options [mss 1460,sackOK,TS val 81748094 ecr 0,nop,wscale 7], length 0
08:43:42.157903 IP 10.164.84.171.7000 > 10.235.62.198.45165: Flags [S.], seq 4035182025, ack 452519827, win 5792, options [mss 1460,sackOK,TS val 538833931 ecr 81748094,nop,wscale 7], length 0
08:43:42.158920 IP 10.235.62.198.45165 > 10.164.84.171.7000: Flags [.], ack 1, win 46, options [nop,nop,TS val 81748094 ecr 538833931], length 0
08:43:42.159053 IP 10.235.62.198.45165 > 10.164.84.171.7000: Flags [P.], seq 1:9, ack 1, win 46, options [nop,nop,TS val 81748094 ecr 538833931], length 8
08:43:42.360086 IP 10.235.62.198.45165 > 10.164.84.171.7000: Flags [P.], seq 1:9, ack 1, win 46, options [nop,nop,TS val 81748145 ecr 538833931], length 8
08:43:42.768080 IP 10.235.62.198.45165 > 10.164.84.171.7000: Flags [P.], seq 1:9, ack 1, win 46, options [nop,nop,TS val 81748247 ecr 538833931], length 8
08:43:43.584072 IP 10.235.62.198.45165 > 10.164.84.171.7000: Flags [P.], seq 1:9, ack 1, win 46, options [nop,nop,TS val 81748451 ecr 538833931], length 8
08:43:45.216087 IP 10.235.62.198.45165 > 10.164.84.171.7000: Flags [P.], seq 1:9, ack 1, win 46, options [nop,nop,TS val 81748859 ecr 538833931], length 8
08:43:45.783333 IP 10.164.84.171.7000 > 10.235.62.198.45165: Flags [S.], seq 4035182025, ack 452519827, win 5792, options [mss 1460,sackOK,TS val 538834838 ecr 81748859,nop,wscale 7], length 0
08:43:45.784337 IP 10.235.62.198.45165 > 10.164.84.171.7000: Flags [.], ack 1, win 46, options [nop,nop,TS val 81749001 ecr 538834838,nop,nop,sack 1 {0:1}], length 0
Here 10.235.62.198 is the new node and 10.164.84.171 is a seed.
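To make the capture easier to read, the repeated segments can be counted with a short script. This is a rough sketch that assumes the tcpdump text format shown above; `repeated_segments` is a hypothetical helper, and a repeated sequence range is only a hint of a retransmission, not proof of the cause.

```python
import re
from collections import Counter

# Matches tcpdump lines such as:
# 08:43:42.159053 IP 10.0.0.1.45165 > 10.0.0.2.7000: Flags [P.], seq 1:9, ack 1, ...
LINE_RE = re.compile(
    r"^\S+ IP (?P<src>\S+) > (?P<dst>\S+): Flags \[(?P<flags>[^\]]+)\]"
    r"(?:, seq (?P<seq>\d+(?::\d+)?))?"
)


def repeated_segments(lines):
    """Return {(src, dst, flags, seq): count} for segments seen more than once."""
    seen = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        # Pure ACKs carry no seq field in this output and are skipped.
        if match and match.group("seq"):
            key = (
                match.group("src"),
                match.group("dst"),
                match.group("flags"),
                match.group("seq"),
            )
            seen[key] += 1
    return {key: count for key, count in seen.items() if count > 1}
```

On the capture above, this reports the 8-byte push (`seq 1:9`) from 10.235.62.198.45165 five times and the seed's SYN-ACK (`seq 4035182025`) twice, so the TCP port is open but the new node's data is never acknowledged.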
We use Cassandra 1.2.6 with virtual nodes (vnodes).
Please help. We have spent nearly 3 days trying to fix this, with no luck.