Timeout when traversing from a node with millions of edges

I have a graph in which some nodes have millions of incident edges, using Titan 0.5.2 on a Cassandra DB. For example, this reproduces such a graph:

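// Define an Integer 'vid' property key and a composite index on it for vertex lookups
mgmt = g.getManagementSystem()
vidp = mgmt.makePropertyKey('vid').dataType(Integer.class).make()
mgmt.buildIndex('by_vid',Vertex.class).addKey(vidp).buildCompositeIndex()
mgmt.commit()

// v0 is the supernode: every later vertex gets an 'is-a' edge pointing at it
def v0 = g.addVertex([vid: 0, type: 'start'])
def random = new Random()
for(i in 1..10000000) {
    def v = g.addVertex([vid: i, type: 'claim'])
    v.addEdge('is-a', v0)
    // Link to one randomly chosen earlier vertex (vid in 0..i-1)
    def n = random.nextInt(i)
    def vr = g.V('vid', n).next()
    v.addEdge('test', vr)
    // Commit once every 10,000 vertices
    if (i%10000 == 0) { g.commit(); }
}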

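So we have 10M vertices, all linked to v0, plus some random links between the vertices. This query: g.V('vid', 0).in('is-a')[0] works fine, and so do g.V('vid', 0).in('is-a')[100] and g.V('vid', 0).in('is-a')[1000]. However, if I try to traverse one step further, i.e. g.V('vid', 0).in('is-a').out('test')[0], the lookup hangs and eventually I get a read timeout exception from Cassandra: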

com.thinkaurelius.titan.core.TitanException: Could not execute operation due to backend exception 

Caused by: com.thinkaurelius.titan.diskstorage.TemporaryBackendException: Could not successfully complete backend operation due to repeated temporary exceptions after Duration[4000 ms] 
at com.thinkaurelius.titan.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:86) 
at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:42) 

Caused by: com.netflix.astyanax.connectionpool.exceptions.TimeoutException: TimeoutException: [host=127.0.0.1(127.0.0.1):9160, latency=10000(10001), attempts=1]org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out 
at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:188) 
at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:

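I also see Cassandra come under heavy load and become unresponsive (i.e. attempts to connect to it time out). So my question is: why is it not possible to traverse further from this node, even though the step that actually touches a lot of nodes works fine, and how can I make it work?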

2014-12-12 StasM

It looks like you have effectively simulated a supernode. When you call:

g.V('vid', 0).in('is-a')[0]

you are requesting only one object, which is a fast lookup. Similarly:

g.V('vid', 0).in('is-a')[100]

also requests only one object, so it is still fast. When you query:

g.V('vid', 0).in('is-a').out('test')[0]

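you have just made the request "find everything reachable over an outgoing 'test' edge from each of the 10 million vertices connected to vertex 0, and return the first one". The first thing it does is traverse all of those edges, and only then can it return the "first" vertex you asked for. Try this instead: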

g.V('vid', 0).in('is-a')[0].out('test')[0]

This will not iterate over all 10 million vertices.
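If more than one result is needed, the same idea can be extended with a range instead of a single index. The following is a minimal sketch rather than a tested recipe: it assumes `g` is the already-open Titan graph from the question (for example in the Gremlin console), and `batchSize` is a hypothetical value picked purely for illustration. It relies on the observation in the question that a bound placed directly after in('is-a') returns quickly; whether a wider range stays fast on a given cluster is something to verify.

```groovy
// Assumption: `g` is the open TitanGraph built by the script in the question.
// Assumption: batchSize is an illustrative value, not a recommended setting.
def batchSize = 100

// Bound the supernode step first: keep only the first `batchSize` vertices
// reached over incoming 'is-a' edges, then follow 'test' edges from those alone.
def targets = g.V('vid', 0).in('is-a')[0..(batchSize - 1)].out('test').toList()

println "fetched ${targets.size()} vertices over 'test' edges"
```

The point of the design is the position of the bound: it sits before out('test'), so the second step starts from at most `batchSize` vertices rather than from every vertex attached to the supernode.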

2015-02-26 20:01:58 pkohan
