Apache Curator - Zookeeper connection loss exception, possible memory leak

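I have been working on a process that continuously monitors distributed atomic long counters. Once a minute it reads a counter through the getCounter method of the class below. In fact, I have multiple threads running, each monitoring a different counter (a distributed atomic long) stored in a Zookeeper node. Each thread specifies the path of its counter through the parameters of getCounter.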

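import java.util.Properties;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.atomic.DistributedAtomicLong;
import org.apache.curator.framework.recipes.atomic.PromotedToLock;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.curator.retry.RetryNTimes;

public class TagserterZookeeperManager {

    // Enum singleton: one shared CuratorFramework client for the whole process.
    public enum ZkClient {
        COUNTER("10.11.18.25:2181"); // Integration URL

        private CuratorFramework client;

        private ZkClient(String servers) {
            // TagserterConfigs is a project class; a configured "servers"
            // value overrides the hard-coded default above.
            Properties props = TagserterConfigs.ZOOKEEPER.getProperties();
            String zkFromConfig = props.getProperty("servers", "");
            if (zkFromConfig != null && !zkFromConfig.isEmpty()) {
                servers = zkFromConfig.trim();
            }
            // Base sleep of 1000 ms, at most 3 retries.
            ExponentialBackoffRetry exponentialBackoffRetry = new ExponentialBackoffRetry(1000, 3);
            client = CuratorFrameworkFactory.newClient(servers, exponentialBackoffRetry);
            client.start();
        }

        public CuratorFramework getClient() {
            return client;
        }
    }

    // Joins the non-empty segments into a path such as /a/b/c.
    public static String buildPath(String... node) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < node.length; i++) {
            if (node[i] != null && !node[i].isEmpty()) {
                sb.append("/");
                sb.append(node[i]);
            }
        }
        return sb.toString();
    }

    // Builds a new DistributedAtomicLong for the given counter path on every call.
    public static DistributedAtomicLong getCounter(String taskType, int hid, String jobId, String countType) {
        String path = buildPath(taskType, hid + "", jobId, countType);
        // If the optimistic update keeps failing, fall back to a lock under <path>/lock.
        PromotedToLock.Builder builder = PromotedToLock.builder()
                .lockPath(path + "/lock")
                .retryPolicy(new ExponentialBackoffRetry(10, 10));
        // Optimistic attempts: up to 5 retries, 20 ms apart.
        DistributedAtomicLong count = new DistributedAtomicLong(
                ZkClient.COUNTER.getClient(), path, new RetryNTimes(5, 20), builder.build());
        return count;
    }

}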

From within a thread, this is how I call the method:

DistributedAtomicLong counterTotal = TagserterZookeeperManager 
         .getCounter("testTopic", hid, jobId, "test");

Now, seemingly after the threads have been running for a few hours, at some stage I start getting the following org.apache.zookeeper.KeeperException$ConnectionLossException inside the getCounter method, where it tries to read the count:

org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /contentTaskProd
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1073)
    at org.apache.curator.utils.ZKPaths.mkdirs(ZKPaths.java:215)
    at org.apache.curator.utils.EnsurePath$InitialHelper$1.call(EnsurePath.java:148)
    at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
    at org.apache.curator.utils.EnsurePath$InitialHelper.ensure(EnsurePath.java:141)
    at org.apache.curator.utils.EnsurePath.ensure(EnsurePath.java:99)
    at org.apache.curator.framework.recipes.atomic.DistributedAtomicValue.getCurrentValue(DistributedAtomicValue.java:254)
    at org.apache.curator.framework.recipes.atomic.DistributedAtomicValue.get(DistributedAtomicValue.java:91)
    at org.apache.curator.framework.recipes.atomic.DistributedAtomicLong.get(DistributedAtomicLong.java:72)
    ...

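I keep getting this exception for a while, and I have the feeling that it causes some internal memory leak, which eventually leads to an OutOfMemory error, at which point the whole process bails out. Does anybody have any idea what the reason could be? Why would Zookeeper suddenly start throwing connection loss exceptions? After the process has exited, I can connect to Zookeeper manually through another small console program I wrote (also using Curator), and everything looks fine there.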


2015-11-07 Asif Iqbal

Hi, how did you finally resolve this issue? I seem to be hitting the same problem even when I explicitly call close() on the Curator framework.

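@SumitNigam Sorry for getting back to you so late on this one. I actually stopped working on that project, and it has been a while since then. As it turned out, we would probably have needed to rewrite and refactor major parts of the project, for a number of reasons. Sorry about that.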

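To monitor a node in Zookeeper with Curator, you can use NodeCache. This won't solve your connection problem, but instead of polling the node once a minute, you get a push event whenever it changes.

In my experience, NodeCache handles disconnects and reconnects quite well.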
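As a rough illustration of the NodeCache suggestion, here is a minimal sketch. The class name CounterWatcher and the helper decodeCounter are made up for this example, and it assumes the watched node holds the 8-byte big-endian long that DistributedAtomicLong writes; adjust the decoding if your node stores something else.

```java
import java.nio.ByteBuffer;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.recipes.cache.ChildData;
import org.apache.curator.framework.recipes.cache.NodeCache;
import org.apache.curator.framework.recipes.cache.NodeCacheListener;

// Hypothetical helper class, not part of the original question's code.
public class CounterWatcher {

    // Assumption: the node data is an 8-byte big-endian long.
    static long decodeCounter(byte[] data) {
        return ByteBuffer.wrap(data).getLong();
    }

    // Starts a NodeCache on the given path and prints the counter on each change.
    // The caller is responsible for calling close() on the returned cache.
    public static NodeCache watch(CuratorFramework client, String path) throws Exception {
        final NodeCache cache = new NodeCache(client, path);
        cache.getListenable().addListener(new NodeCacheListener() {
            @Override
            public void nodeChanged() {
                ChildData current = cache.getCurrentData();
                // getCurrentData() returns null when the node does not exist.
                if (current == null || current.getData() == null
                        || current.getData().length < 8) {
                    return;
                }
                System.out.println(current.getPath() + " = "
                        + decodeCounter(current.getData()));
            }
        });
        cache.start();
        return cache;
    }
}
```

With the question's setup, each monitoring thread could be replaced by one call such as CounterWatcher.watch(ZkClient.COUNTER.getClient(), path), keeping the returned NodeCache around and closing it on shutdown.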


2016-01-11 20:30:52
