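Apache Curator - ZooKeeper connection loss exception, possible memory leak

I have been working on a process that continuously monitors distributed atomic long counters. It checks a counter once a minute through the getCounter method of the class below, which wraps the ZkClient enum. I actually have multiple threads running, each monitoring a different counter (a DistributedAtomicLong) stored in a ZooKeeper node. Each thread specifies the path of its counter through the parameters of getCounter.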
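    import java.util.Properties;

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.atomic.DistributedAtomicLong;
    import org.apache.curator.framework.recipes.atomic.PromotedToLock;
    import org.apache.curator.retry.ExponentialBackoffRetry;
    import org.apache.curator.retry.RetryNTimes;

    public class TagserterZookeeperManager {

        // Enum singleton holding the one CuratorFramework client shared by all threads.
        public enum ZkClient {
            COUNTER("10.11.18.25:2181"); // Integration URL

            private final CuratorFramework client;

            private ZkClient(String servers) {
                // TagserterConfigs is a project class (not shown); a configured
                // "servers" value overrides the default connection string.
                Properties props = TagserterConfigs.ZOOKEEPER.getProperties();
                String zkFromConfig = props.getProperty("servers", "");
                if (zkFromConfig != null && !zkFromConfig.isEmpty()) {
                    servers = zkFromConfig.trim();
                }
                // Base sleep of 1000 ms, at most 3 retries.
                ExponentialBackoffRetry exponentialBackoffRetry = new ExponentialBackoffRetry(1000, 3);
                client = CuratorFrameworkFactory.newClient(servers, exponentialBackoffRetry);
                client.start();
            }

            public CuratorFramework getClient() {
                return client;
            }
        }

        // Joins the non-empty segments into a ZooKeeper path such as /a/b/c.
        public static String buildPath(String... node) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < node.length; i++) {
                if (node[i] != null && !node[i].isEmpty()) {
                    sb.append("/");
                    sb.append(node[i]);
                }
            }
            return sb.toString();
        }

        // Builds a new DistributedAtomicLong for the given path on every call.
        public static DistributedAtomicLong getCounter(String taskType, int hid, String jobId, String countType) {
            String path = buildPath(taskType, String.valueOf(hid), jobId, countType);
            PromotedToLock.Builder builder = PromotedToLock.builder()
                    .lockPath(path + "/lock")
                    .retryPolicy(new ExponentialBackoffRetry(10, 10));
            return new DistributedAtomicLong(ZkClient.COUNTER.getClient(), path, new RetryNTimes(5, 20), builder.build());
        }
    }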
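To make the node layout concrete, here is a small standalone sketch of what buildPath produces for a call like the one used in this question. The buildPath body is copied from the class above so the snippet runs without Curator; the values 7 and "job42" for hid and jobId are made up for illustration.

```java
// Standalone copy of buildPath; no ZooKeeper or Curator needed to run it.
final class BuildPathDemo {

    static String buildPath(String... node) {
        StringBuilder sb = new StringBuilder();
        for (String segment : node) {
            // null and empty segments are skipped, exactly as in the original
            if (segment != null && !segment.isEmpty()) {
                sb.append('/').append(segment);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int hid = 7;            // hypothetical value
        String jobId = "job42"; // hypothetical value
        String path = buildPath("testTopic", String.valueOf(hid), jobId, "test");
        System.out.println(path);           // /testTopic/7/job42/test
        System.out.println(path + "/lock"); // /testTopic/7/job42/test/lock (the PromotedToLock path)
    }
}
```

So each counter lives at its own node, and the lock used when the optimistic update is promoted sits directly underneath it.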
From within a thread, this is how getCounter is called:

    DistributedAtomicLong counterTotal = TagserterZookeeperManager
            .getCounter("testTopic", hid, jobId, "test");
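For readers who want a picture of the monitoring setup, the once-a-minute polling could look roughly like the sketch below. This is not my actual thread code. CounterMonitorSketch, pollOnce and start are names invented for the example, and a plain LongSupplier stands in for reading a DistributedAtomicLong (whose get() throws a checked exception and would need wrapping) so that the snippet runs without ZooKeeper.

```java
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

final class CounterMonitorSketch {

    // Reads one counter and formats the result. Failures are reported rather than
    // rethrown, because an exception escaping a scheduled task cancels its future runs.
    static String pollOnce(String path, LongSupplier reader) {
        try {
            return path + " = " + reader.getAsLong();
        } catch (RuntimeException e) {
            return path + " read failed: " + e.getMessage();
        }
    }

    // Schedules one periodic poll per counter path on a shared pool.
    static ScheduledExecutorService start(Map<String, LongSupplier> readers,
                                          long period, TimeUnit unit) {
        ScheduledExecutorService pool =
                Executors.newScheduledThreadPool(Math.max(1, readers.size()));
        readers.forEach((path, reader) -> pool.scheduleAtFixedRate(
                () -> System.out.println(pollOnce(path, reader)), 0, period, unit));
        return pool;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicLong fakeCounter = new AtomicLong(5); // stand-in for a ZooKeeper-backed counter
        ScheduledExecutorService pool =
                start(Map.of("/testTopic/7/job42/test", fakeCounter::get), 1, TimeUnit.MINUTES);
        Thread.sleep(200);  // let the first poll run
        pool.shutdownNow(); // stop the demo instead of polling forever
    }
}
```

The point of pollOnce is only that a failed read is logged and the next poll still happens a minute later, which matches what I observe: the exception keeps recurring rather than stopping the thread.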
After the threads have been running for a few hours, at some point I start getting the following org.apache.zookeeper.KeeperException$ConnectionLossException inside the getCounter method, where it tries to read the count:

    org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /contentTaskProd
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1073)
        at org.apache.curator.utils.ZKPaths.mkdirs(ZKPaths.java:215)
        at org.apache.curator.utils.EnsurePath$InitialHelper$1.call(EnsurePath.java:148)
        at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
        at org.apache.curator.utils.EnsurePath$InitialHelper.ensure(EnsurePath.java:141)
        at org.apache.curator.utils.EnsurePath.ensure(EnsurePath.java:99)
        at org.apache.curator.framework.recipes.atomic.DistributedAtomicValue.getCurrentValue(DistributedAtomicValue.java:254)
        at org.apache.curator.framework.recipes.atomic.DistributedAtomicValue.get(DistributedAtomicValue.java:91)
        at org.apache.curator.framework.recipes.atomic.DistributedAtomicLong.get(DistributedAtomicLong.java:72)
        ...
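From then on I keep getting this exception for a while, and I get the feeling it is causing some internal memory leak, which eventually leads to an OutOfMemory error and the whole process bails out. Does anybody have any idea what the reason for this could be? Why would ZooKeeper suddenly start throwing connection loss exceptions? After the process has exited, I can connect to ZooKeeper manually through another small console program I wrote (also using Curator), and everything looks fine there.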
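The memory leak is only a suspicion so far. One way to check it would be to log heap usage on every poll and see whether it climbs once the exceptions start. The following is a minimal sketch using only the standard Runtime API; HeapUsageSketch and its method names are made up for illustration and are not part of my process.

```java
import java.util.Locale;

final class HeapUsageSketch {

    // Heap currently in use by this JVM, in bytes (allocated heap minus free heap).
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Formats a usage line in whole MiB, suitable for logging next to each counter read.
    static String describe(long usedBytes, long maxBytes) {
        return String.format(Locale.ROOT, "heap used: %d MiB of %d MiB",
                usedBytes >> 20, maxBytes >> 20);
    }

    public static void main(String[] args) {
        System.out.println(describe(usedHeapBytes(), Runtime.getRuntime().maxMemory()));
    }
}
```

A steadily rising figure across polls after the first ConnectionLoss would support the leak theory. A flat one would point elsewhere, and a heap dump would be the next step either way.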
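Hi, how did you finally solve this problem? I seem to be hitting the same issue even when I explicitly call close() on the Curator framework.
@SumitNigam Sorry for getting back to you so late on this one. I have actually stopped working on that project, and it has been a while since then. It turned out that we would probably need to rewrite and restructure major parts of the project, for a number of reasons. Sorry about that.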