
I have a Kafka cluster (broker0, broker1, broker2) with replication factor 2 running on 3 nodes (node0, node1, node2), and Zookeeper (the one packaged with the Kafka tar) on a different node (node4). Killing the node that holds __consumer_offsets results in no messages being consumed by the consumers.

I started broker 0 after starting zookeeper and the other nodes. In the broker 0 log I can see __consumer_offsets being read, so it seems they are stored on broker 0. A sample of the log is below:

Kafka version: kafka_2.10-0.10.2.0

[2017-06-30 10:50:47,381] INFO [GroupCoordinator 0]: Loading group metadata for console-consumer-85124 with generation 2 (kafka.coordinator.GroupCoordinator) 
    [2017-06-30 10:50:47,382] INFO [Group Metadata Manager on Broker 0]: Finished loading offsets from __consumer_offsets-41 in 23 milliseconds. (kafka.coordinator.GroupMetadataManager) 
    [2017-06-30 10:50:47,382] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from __consumer_offsets-44 (kafka.coordinator.GroupMetadataManager) 
    [2017-06-30 10:50:47,387] INFO [Group Metadata Manager on Broker 0]: Finished loading offsets from __consumer_offsets-44 in 5 milliseconds. (kafka.coordinator.GroupMetadataManager) 
    [2017-06-30 10:50:47,387] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from __consumer_offsets-47 (kafka.coordinator.GroupMetadataManager) 
    [2017-06-30 10:50:47,398] INFO [Group Metadata Manager on Broker 0]: Finished loading offsets from __consumer_offsets-47 in 11 milliseconds. (kafka.coordinator.GroupMetadataManager) 
    [2017-06-30 10:50:47,398] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from __consumer_offsets-1 (kafka.coordinator.GroupMetadataManager) 

Also, I can see GroupCoordinator messages in the same broker 0 log.

[2017-06-30 14:35:22,874] INFO [GroupCoordinator 0]: Preparing to restabilize group console-consumer-34472 with old generation 1 (kafka.coordinator.GroupCoordinator) 
    [2017-06-30 14:35:22,877] INFO [GroupCoordinator 0]: Group console-consumer-34472 with generation 2 is now empty (kafka.coordinator.GroupCoordinator) 
    [2017-06-30 14:35:25,946] INFO [GroupCoordinator 0]: Preparing to restabilize group console-consumer-6612 with old generation 1 (kafka.coordinator.GroupCoordinator) 
    [2017-06-30 14:35:25,946] INFO [GroupCoordinator 0]: Group console-consumer-6612 with generation 2 is now empty (kafka.coordinator.GroupCoordinator) 
    [2017-06-30 14:35:38,326] INFO [GroupCoordinator 0]: Preparing to restabilize group console-consumer-30165 with old generation 1 (kafka.coordinator.GroupCoordinator) 
    [2017-06-30 14:35:38,326] INFO [GroupCoordinator 0]: Group console-consumer-30165 with generation 2 is now empty (kafka.coordinator.GroupCoordinator) 
    [2017-06-30 14:43:15,656] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 3 milliseconds. (kafka.coordinator.GroupMetadataManager) 
    [2017-06-30 14:53:15,653] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager) 

While testing the cluster's fault tolerance with kafka-console-consumer.sh and kafka-console-producer.sh, I observed that killing broker 1 or broker 2 still lets the consumer receive new messages from the producer; rebalancing takes place.
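For reference, a minimal sketch of the kind of console commands used for such a test (the addresses are placeholders matching the masked XXX values in the configs below; script names as shipped in the Kafka tar):

# Produce test messages to the topic (any live broker can be given here).
kafka-console-producer.sh --broker-list XXX:9092 --topic test-topic

# Consume them; in 0.10.2 the console consumer can still run as the old
# zookeeper-based consumer, which matches the consumer properties below.
kafka-console-consumer.sh --zookeeper XXX:2181 --topic test-topic --from-beginning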

However, killing broker 0 means that no consumer receives any new or old messages. Below is the topic state before and after broker 0 was killed.

Before

Topic:test-topic PartitionCount:3 ReplicationFactor:2 Configs: 
    Topic: test-topic Partition: 0 Leader: 2 Replicas: 2,0 Isr: 0,2 
    Topic: test-topic Partition: 1 Leader: 0 Replicas: 0,1 Isr: 0,1 
    Topic: test-topic Partition: 2 Leader: 1 Replicas: 1,2 Isr: 1,2 

After

Topic:test-topic PartitionCount:3 ReplicationFactor:2 Configs: 
    Topic: test-topic Partition: 0 Leader: 2 Replicas: 2,0 Isr: 2 
    Topic: test-topic Partition: 1 Leader: 1 Replicas: 0,1 Isr: 1 
    Topic: test-topic Partition: 2 Leader: 1 Replicas: 1,2 Isr: 1,2 

Below are the WARN messages that appeared in the consumer log after broker 0 was killed.

[2017-06-30 14:19:17,155] WARN Auto-commit of offsets {test-topic-2=OffsetAndMetadata{offset=4, metadata=''}, test-topic-0=OffsetAndMetadata{offset=5, metadata=''}, test-topic-1=OffsetAndMetadata{offset=4, metadata=''}} failed for group console-consumer-34472: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator) 
[2017-06-30 14:19:10,542] WARN Auto-commit of offsets {test-topic-2=OffsetAndMetadata{offset=4, metadata=''}, test-topic-0=OffsetAndMetadata{offset=5, metadata=''}, test-topic-1=OffsetAndMetadata{offset=4, metadata=''}} failed for group console-consumer-30165: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator) 

Broker properties. The remaining properties are left at their defaults.

broker.id=0 
delete.topic.enable=true 

auto.create.topics.enable=false 
listeners=PLAINTEXT://XXX:9092 
advertised.listeners=PLAINTEXT://XXX:9092 
log.dirs=/tmp/kafka-logs-test1 
num.partitions=3 
zookeeper.connect=XXX:2181 
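Note that nothing above overrides the settings of the internal offsets topic, so the broker defaults apply. As a sketch only (illustrative values, not taken from the original setup), these are the broker-side properties that govern how __consumer_offsets is created; in this Kafka version the topic can still end up with fewer replicas if fewer brokers are alive when the first consumer group appears:

# Desired replication for the internal __consumer_offsets topic.
offsets.topic.replication.factor=3
# Number of partitions of the offsets topic (50 is the default).
offsets.topic.num.partitions=50
# Default replication factor for automatically created topics.
default.replication.factor=3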

Producer properties. The remaining properties are left at their defaults.

bootstrap.servers=XXX,XXX,XXX 
compression.type=snappy 

Consumer properties. The remaining properties are left at their defaults.

zookeeper.connect=XXX:2181 
zookeeper.connection.timeout.ms=6000 
group.id=test-consumer-group 

As I understand it, if the node hosting/acting as the GroupCoordinator and __consumer_offsets dies, the consumers cannot resume normal operation even though new leaders are elected for the partitions.
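To make that dependency concrete: in stock Kafka the coordinator of a group is the broker leading the __consumer_offsets partition numbered abs(hash(group.id)) modulo the offsets partition count (50 by default). A hedged way to see which broker that is (XXX being the masked zookeeper address used above):

# List the leaders of all offsets partitions; a group's coordinator is the
# leader of the partition its group.id hashes to. With ReplicationFactor:1
# each of these partitions lives on exactly one broker.
kafka-topics.sh --zookeeper XXX:2181 --describe --topic __consumer_offsets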

I saw something similar in a post. That post suggests restarting the dead broker node. However, even though there are more nodes in the cluster, message consumption will be delayed until broker 0 is restarted, which matters in a production environment.

Q1: How can the situation above be mitigated?

Q2: Is there a way to move the GroupCoordinator / __consumer_offsets to another node?

Any suggestions/help is appreciated.

Answer


Check the replication factor on the __consumer_offsets topic. If it is not 3, then that is your problem.

Run the command kafka-topics --zookeeper localhost:2181 --describe --topic __consumer_offsets and see whether the first line of output shows "ReplicationFactor:1" or "ReplicationFactor:3".

This is a common problem when experimenting: the cluster is first set up with a single node, and this topic gets created with a replication factor of 1. Later, when you expand to 3 nodes, you forget to change this topic-level setting on the existing topic, so while the topics you are producing to and consuming from are fault tolerant, the offsets topic is still stuck on broker 0 alone.
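If it does show 1, one way to raise it afterwards is the partition reassignment tool. A minimal sketch, assuming broker IDs 0, 1, 2 and the default 50 offsets partitions (check the partition count in the describe output first; the JSON file name is arbitrary):

# Build a reassignment file that puts every __consumer_offsets partition on
# all three brokers. Adjust the 0..49 range if the partition count differs.
{
  echo '{"version":1,"partitions":['
  for p in $(seq 0 49); do
    sep=$([ "$p" -lt 49 ] && echo ',' || echo '')
    echo "  {\"topic\":\"__consumer_offsets\",\"partition\":$p,\"replicas\":[0,1,2]}$sep"
  done
  echo ']}'
} > increase-offsets-rf.json

# Execute the reassignment, then verify that it has completed.
kafka-reassign-partitions.sh --zookeeper localhost:2181 \
  --reassignment-json-file increase-offsets-rf.json --execute
kafka-reassign-partitions.sh --zookeeper localhost:2181 \
  --reassignment-json-file increase-offsets-rf.json --verify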


Thanks Hans. I did not change the replication factor of the default topics that were created. I will try the suggestion – FindingTheOne


I know you did not change the replication factor, but if you did any publishing/consuming while the cluster was still a single node, before expanding it to 3 nodes, you will likely find that the offsets topic was created with a replication factor of 1. Run the command 'kafka-topics --zookeeper localhost:2181 --describe --topic __consumer_offsets' and see whether the first line of output shows "ReplicationFactor:1" or "ReplicationFactor:3". –


Yes, the replication factor is 1. I will have to change it – FindingTheOne
