2016-11-08 65 views

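Incorrect merged views with Infinispan and JGroups

I have a 3-node Infinispan cluster with numOwners = 2, and I am running into problems with the cluster view when a node is disconnected from the network and then rejoins. Here are the logs: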

(Incoming-1,BrokerPE-0-28575) ISPN000094: Received new cluster view for channel ISPN: [BrokerPE-0-28575|2] (3) [BrokerPE-0-28575, SEM03VVM-201-59385, SEM03VVM-202-33714]

ISPN000094: Received new cluster view for channel ISPN: [BrokerPE-0-28575|3] (2) [BrokerPE-0-28575, SEM03VVM-202-33714] -> one node disconnected

ISPN000093: Received new, MERGED cluster view for channel ISPN: MergeView::[BrokerPE-0-28575|4] (2) [BrokerPE-0-28575, SEM03VVM-201-59385], 2 subgroups: [BrokerPE-0-28575|3] (2) [BrokerPE-0-28575, SEM03VVM-202-33714], [BrokerPE-0-28575|2] (3) [BrokerPE-0-28575, SEM03VVM-201-59385, SEM03VVM-202-33714] -> incorrect merge

Here is my JGroups configuration:

<config xmlns="urn:org:jgroups" 
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
     xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups-3.6.xsd"> 
    <TCP 
      bind_addr="${jgroups.tcp.address:127.0.0.1}" 
     bind_port="${jgroups.tcp.port:7800}" 
     loopback="true" 
     port_range="30" 
     recv_buf_size="20m" 
     send_buf_size="640k" 
     max_bundle_size="31k" 
     use_send_queues="true" 
     enable_diagnostics="false" 
     sock_conn_timeout="300" 
     bundler_type="old" 

     thread_naming_pattern="pl" 

     timer_type="new3" 
     timer.min_threads="4" 
     timer.max_threads="10" 
     timer.keep_alive_time="3000" 
     timer.queue_max_size="500" 


     thread_pool.enabled="true" 
     thread_pool.min_threads="2" 
     thread_pool.max_threads="30" 
     thread_pool.keep_alive_time="60000" 
     thread_pool.queue_enabled="true" 
     thread_pool.queue_max_size="100" 
     thread_pool.rejection_policy="Discard" 

     oob_thread_pool.enabled="true" 
     oob_thread_pool.min_threads="2" 
     oob_thread_pool.max_threads="30" 
     oob_thread_pool.keep_alive_time="60000" 
     oob_thread_pool.queue_enabled="false" 
     oob_thread_pool.queue_max_size="100" 
     oob_thread_pool.rejection_policy="Discard" 

     internal_thread_pool.enabled="true" 
     internal_thread_pool.min_threads="1" 
     internal_thread_pool.max_threads="10" 
     internal_thread_pool.keep_alive_time="60000" 
     internal_thread_pool.queue_enabled="true" 
     internal_thread_pool.queue_max_size="100" 
     internal_thread_pool.rejection_policy="Discard" 
     /> 

    <!-- Ergonomics, new in JGroups 2.11, are disabled by default in TCPPING until JGRP-1253 is resolved --> 
    <TCPPING timeout="3000" initial_hosts="${jgroups.tcpping.initial_hosts:HostA[7800],HostB[7801]}" 
      port_range="2" 
      num_initial_members="3" 
      ergonomics="false" 
     /> 

    <!-- MPING bind_addr="${jgroups.bind_addr:127.0.0.1}" break_on_coord_rsp="true" 
     mcast_addr="${jboss.default.multicast.address:228.2.4.6}" 
     mcast_port="${jgroups.mping.mcast_port:43366}" 
     ip_ttl="${jgroups.udp.ip_ttl:2}" 
     num_initial_members="3"/--> 
    <!-- <MPING bind_addr="${jgroups.bind_addr:127.0.0.1}" break_on_coord_rsp="true" 
     mcast_addr="${jboss.default.multicast.address:228.2.4.6}" 
     mcast_port="${jgroups.mping.mcast_port:43366}" 
     ip_ttl="${jgroups.udp.ip_ttl:2}" 
     num_initial_members="3"/> --> 
    <MERGE3 max_interval="30000" min_interval="10000"/> 

    <FD_SOCK bind_addr="${jgroups.bind_addr}"/> 
    <FD timeout="3000" max_tries="3"/> 
    <VERIFY_SUSPECT timeout="3000"/> 
    <!-- <BARRIER /> --> 
    <!-- <pbcast.NAKACK use_mcast_xmit="false" retransmit_timeout="300,600,1200,2400,4800" discard_delivered_msgs="true"/> --> 
    <pbcast.NAKACK2 use_mcast_xmit="false" 
        xmit_interval="1000" 
        xmit_table_num_rows="100" 
        xmit_table_msgs_per_row="10000" 
        xmit_table_max_compaction_time="10000" 
        max_msg_batch_size="100" discard_delivered_msgs="true"/> 
    <UNICAST3 xmit_interval="500" 
      xmit_table_num_rows="20" 
      xmit_table_msgs_per_row="10000" 
      xmit_table_max_compaction_time="10000" 
      max_msg_batch_size="100" 
      conn_expiry_timeout="0"/> 

    <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000" max_bytes="400000"/> 
    <pbcast.GMS print_local_addr="true" join_timeout="3000" view_bundling="true" merge_timeout="6000"/> 
    <tom.TOA/> <!-- the TOA is only needed for total order transactions--> 

    <UFC max_credits="2m" min_threshold="0.40"/> 
    <!-- <MFC max_credits="2m" min_threshold="0.40"/> --> 
    <FRAG2 frag_size="30k"/> 
    <RSVP timeout="60000" resend_interval="500" ack_on_delivery="false" /> 
    <!-- <pbcast.STATE_TRANSFER/> --> 
</config> 

I am using Infinispan 7.0.2 and JGroups 3.6.1. I have tried many configurations, but nothing has worked. Any help would be greatly appreciated.

[UPDATE] Things work correctly after setting the following property to a value above 1: "internal_thread_pool.min_threads".
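
As an illustration, the change described in the update might look like the fragment below. This is only a sketch: the value 2 is an assumed example of "above 1", and every other TCP attribute is meant to stay exactly as in the configuration above.

```xml
<!-- Sketch only: the remaining TCP attributes from the configuration above
     are omitted here for brevity and stay unchanged. -->
<!-- min_threads="2" is an assumed value; the update only says "above 1"
     (the original configuration used "1"). -->
<TCP
     internal_thread_pool.enabled="true"
     internal_thread_pool.min_threads="2"
     internal_thread_pool.max_threads="10"
     />
```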


Have you tried a newer Infinispan version, e.g. 8.2.4.Final? –


@DanBerindei I have not, but the problem here seems to be with the JGroups cluster merge. – geekprogrammer


@DanBerindei We also tried Infinispan 8.2.4 and got the same problem. – geekprogrammer

Answers

1

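So, to simplify, we have:

  • View broker|2 = {broker, 201, 202}
  • 201 leaves, and the view is now broker|3 = {broker, 202}
  • Then a merge happens between views broker|3 and broker|2, resulting in the incorrect view broker|4 = {broker, 201}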

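I created [1] to investigate what is going on here. For a start, the subviews of the merge view should have included 202 as a subgroup coordinator, but that was not the case.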

Can you describe exactly what happened? Can this be reproduced? It would be good to have TRACE-level logs for FD, FD_ALL, MERGE3 and GMS...

[1] https://issues.jboss.org/browse/JGRP-2128
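
One possible way to collect the trace logs requested above, assuming log4j2 is the logging backend in use (the thread does not say which one is): raise the level of the loggers named after the protocol classes. The appender name FILE is hypothetical, and the FD_ALL logger only produces output if that protocol is actually in the stack (the configuration in the question uses FD and FD_SOCK).

```xml
<!-- Sketch of the <Loggers> section of a log4j2.xml file.
     Assumption: log4j2 is the logging backend; adapt for other backends. -->
<Loggers>
  <!-- Failure detection -->
  <Logger name="org.jgroups.protocols.FD" level="TRACE"/>
  <Logger name="org.jgroups.protocols.FD_ALL" level="TRACE"/>
  <!-- Merge detection and group membership -->
  <Logger name="org.jgroups.protocols.MERGE3" level="TRACE"/>
  <Logger name="org.jgroups.protocols.pbcast.GMS" level="TRACE"/>
  <!-- "FILE" is a hypothetical appender defined elsewhere in the same file -->
  <Root level="INFO">
    <AppenderRef ref="FILE"/>
  </Root>
</Loggers>
```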


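Yes, it is consistently reproducible in our environment when we manually disconnect one of our nodes from the network and connect it back. Thanks for creating the bug; I will add the trace logs. – geekprogrammer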


Is there a workaround for this issue? – geekprogrammer