In Neo4j, is it possible to find all nodes whose relationships are a superset of another node's relationships?

Given the following contrived database:

CREATE (a:Content {id:'A'}), 
    (b:Content {id:'B'}), 
    (c:Content {id:'C'}), 
    (d:Content {id:'D'}), 
    (ab:Container {id:'AB'}), 
    (ab2:Container {id:'AB2'}), 
    (abc:Container {id:'ABC'}), 
    (abcd:Container {id:'ABCD'}), 
    ((ab)-[:CONTAINS]->(a)), 
    ((ab)-[:CONTAINS]->(b)), 
    ((ab2)-[:CONTAINS]->(a)), 
    ((ab2)-[:CONTAINS]->(b)), 
    ((abc)-[:CONTAINS]->(a)), 
    ((abc)-[:CONTAINS]->(b)), 
    ((abc)-[:CONTAINS]->(c)), 
    ((abcd)-[:CONTAINS]->(a)), 
    ((abcd)-[:CONTAINS]->(b)), 
    ((abcd)-[:CONTAINS]->(c)), 
    ((abcd)-[:CONTAINS]->(d))

Is there a query that can detect all pairs of Container nodes in which one CONTAINS a superset of, or the same set of, Content nodes as the other Container node?

For my example database, I would want the query to return:

(ABCD) is a superset of (ABC), (AB), and (AB2) 
(ABC) is a superset of (AB), and (AB2) 
(AB) and (AB2) contain the same nodes
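
As a cross-check for whatever query is used, the expected result can also be computed outside the database. The following is only a minimal sketch in plain Python, not part of the original question: the `containers` dictionary is hand-copied from the CREATE statement above, and `classify_pairs` is a hypothetical helper name.

```python
# Sample data hand-copied from the CREATE statement: container id -> content ids.
containers = {
    "AB": {"A", "B"},
    "AB2": {"A", "B"},
    "ABC": {"A", "B", "C"},
    "ABCD": {"A", "B", "C", "D"},
}


def classify_pairs(containers):
    """Return (strict_supersets, equal_pairs) over all pairs of containers.

    classify_pairs is a hypothetical helper used only for this illustration.
    """
    strict = {}  # superset id -> ids of containers it strictly contains
    equal = []   # pairs with identical contents, each pair reported once
    for first, first_contents in containers.items():
        for second, second_contents in containers.items():
            if first == second:
                continue
            if first_contents > second_contents:
                # Python's ">" on sets tests for a proper superset.
                strict.setdefault(first, []).append(second)
            elif first_contents == second_contents and first < second:
                # "first < second" keeps (AB, AB2) and drops (AB2, AB).
                equal.append((first, second))
    return {key: sorted(value) for key, value in strict.items()}, equal


strict, equal = classify_pairs(containers)
print(strict)  # {'ABC': ['AB', 'AB2'], 'ABCD': ['AB', 'AB2', 'ABC']}
print(equal)   # [('AB', 'AB2')]
```

This compares every pair, so it is only meant for checking small samples against the output of a Cypher query.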

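If Cypher is not well suited to this but another query language is, or if Neo4j is not well suited to this but another database is, I would appreciate input on that as well.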

Query performance of the answers (as of 2017-02-28T21:56Z)

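I don't yet have enough experience with Neo4j or graph database queries to analyze the performance of the answers, and I haven't yet built my larger dataset for a more meaningful comparison, but I thought I would run each query with the PROFILE command and list its database-hit cost. I have omitted timing data because I couldn't get it to be consistent or meaningful on such a small dataset.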

stdob--: 129 total db hits
Dave Bennett: 46 total db hits
InverseFalcon: 27 total db hits

Source

2017-02-27 Gregyski

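Both Dave Bennett's and stdob--'s answers appear to give me the results I asked for, thank you. I have upvoted both and will accept an answer once I have tried them on a larger dataset, since I have to pick one. – Gregyski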

About how many Container nodes are in the larger dataset? – InverseFalcon

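I haven't assembled it yet (that will take some work, which has moved onto my agenda now that I know I have workable tools for the later calculations). However, 70,000 Containers seems like a realistic estimate. The number of Contents per Container ranges from a few to a few hundred, but the average is probably around 30. – Gregyski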

Here is a first attempt. I'm sure it could use some refinement, but it should get you going.

// find the containers and their contents 
match (n:Container)-[:CONTAINS]->(c:Content) 

// group the contents per container 
with n as container, collect(c.id) as contents 

// combine the containers and their contents
with collect(container{.id, contents: contents}) as containers 

// loop through the list of containers 
with containers, size(containers) as container_size 
unwind range(0, container_size -1) as i 
unwind range(0, container_size -1) as j 

// for each container pair compare the contents 
with containers, i, j 
where i <> j 
and all(content IN containers[j].contents WHERE content in containers[i].contents) 
with containers[i].id as superset, containers[j].id as subset 
return superset, collect(subset) as subsets

Source

2017-02-27 20:14:59

// Get contents for each container 
MATCH (SS:Container)-[:CONTAINS]->(CT:Content) 
     WITH SS, 
      collect(distinct CT) as CTS 
// Get all containers other than SS
MATCH (T:Container) 
     WHERE T <> SS 
// For each container get their content 
MATCH (T)-[:CONTAINS]->(CT:Content) 
     // Test if nested
     WITH SS, 
     CTS, 
     T, 
     ALL(ct in collect(distinct CT) WHERE ct in CTS) as test 
     WHERE test = true 
RETURN SS, collect(T)

Source

2017-02-27 20:33:29

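The approach I would use, after getting the containers and their collected contents, is to first narrow down which containers get compared with each other by the count of their contents, and then run apoc.coll.containsAll() from APOC Procedures to filter down to the superset/equal sets. Finally, you can compare the content counts to determine whether a pair is a superset or an equal set, and then collect.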

Something like this:

match (con:Container)-[:CONTAINS]->(content) 
with con, collect(content) as contents 
with collect({con:con, contents:contents, size:size(contents)}) as all 
unwind all as first 
unwind all as second 
with first, second 
where first <> second and first.size >= second.size 
with first, second 
where apoc.coll.containsAll(first.contents, second.contents) 
with first, 
case when first.size = second.size and id(first.con) < id(second.con) then second end as same, 
case when first.size > second.size then second end as superset 
with first.con as container, collect(same.con) as sameAs, collect(superset.con) as supersetOf 
where size(sameAs) > 0 or size(supersetOf) > 0 
return container, sameAs, supersetOf 
order by size(supersetOf) desc, size(sameAs) desc
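
The same strategy (cheap size filter first, containment test second, then classify by size) can be sketched in plain Python to make the control flow easier to follow. This is an illustration under assumptions, not the APOC implementation: `compare_containers` is a hypothetical name, the sample data is hand-copied from the question, and the string container ids stand in for the internal node ids used by the `id(first.con) < id(second.con)` tie-break.

```python
# Sample data hand-copied from the question: container id -> content ids.
sample = {
    "AB": ["A", "B"],
    "AB2": ["A", "B"],
    "ABC": ["A", "B", "C"],
    "ABCD": ["A", "B", "C", "D"],
}


def compare_containers(contents_by_id):
    """Hypothetical helper mirroring the query: size filter, then containment."""
    entries = [(cid, frozenset(items)) for cid, items in contents_by_id.items()]
    results = {}
    for first_id, first in entries:
        same_as, superset_of = [], []
        for second_id, second in entries:
            # Cheap check first: a smaller set can never contain a larger one.
            if first_id == second_id or len(first) < len(second):
                continue
            # Containment test, playing the role of apoc.coll.containsAll().
            if not second <= first:
                continue
            if len(first) == len(second):
                # Equal sets: report each pair once, from the smaller id.
                if first_id < second_id:
                    same_as.append(second_id)
            else:
                superset_of.append(second_id)
        if same_as or superset_of:
            results[first_id] = {"sameAs": same_as, "supersetOf": superset_of}
    return results


results = compare_containers(sample)
for container, row in results.items():
    print(container, row)
```

The size check costs a single integer comparison, so it skips the more expensive containment test for every pair where the first container is the smaller one.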

Source

2017-02-28 13:09:56 InverseFalcon
