2013-03-26 61 views

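How to verify that a specific path does not exist in a Cypher query

I want to retrieve nodes that do not have a specific relationship (that is, a relationship with specific properties).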

The graph contains entity nodes (n), each of which occurs in a file (f) at a specific line (line_nr).

My current query is as follows:

start n=node:entities("text:*") 
MATCH p=(n)-[left:OCCURS]->(f) 
    , p4=(f)<-[right4?:OCCURS]-(n4) 
    , p7=(f)<-[right7?:OCCURS]-(n7) 
WHERE ( ( 
    (n4.text? =~ "nonreachablenodestextregex" AND (p4 = null OR left.line_nr < right4.line_nr - 0 OR left.line_nr > right4.line_nr + 0 OR ID(left) = ID(right4) ) ) ) 
    AND ( 
    (n7.text? =~ "othernonreachablenodestextregex" AND (p7 = null OR left.line_nr < right7.line_nr - 0 OR left.line_nr > right7.line_nr + 0 OR ID(left) = ID(right7) ) ) ) ) 
WITH n, left, f, count(*) as group_by_cause 
RETURN ID(left) as occ_id, 
     n.text as ent_text, 
     substring(f.text, ABS(left.file_offset-1), 2 + LENGTH(n.text)) as occ_text, 
     f.path as file_path, 
     left.line_nr as occ_line_nr, 
     ID(f) as file_id 

Instead of adding a new path to the MATCH clause, I think it should also be possible to write:

NOT ((f)<-[right4:OCCURS]-(n4)) 

However, I don't want to exclude the existence of any path at all, only the existence of specific paths.
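To make the intended filter concrete, here is a minimal Python sketch of the semantics, not of how Neo4j evaluates the query. It assumes an occurrence should be dropped only when a different occurrence in the same file, belonging to an entity whose text matches the regex, lies within a line window of it. The `Occurrence` type, the `keep` helper, and the sample data are hypothetical and exist only for illustration.

```python
import re
from typing import NamedTuple


class Occurrence(NamedTuple):
    occ_id: int    # stands in for ID(left)
    ent_text: str  # stands in for n.text
    file_id: int   # stands in for ID(f)
    line_nr: int   # stands in for left.line_nr


def keep(occ, all_occs, blocker_regex, window=0):
    """Return True if no *other* occurrence in the same file, owned by an
    entity whose text fully matches blocker_regex, lies within `window`
    lines of `occ`."""
    pattern = re.compile(blocker_regex)
    return not any(
        other.occ_id != occ.occ_id
        and other.file_id == occ.file_id
        and pattern.fullmatch(other.ent_text) is not None
        and abs(other.line_nr - occ.line_nr) <= window
        for other in all_occs
    )


# Hypothetical sample data: three occurrences in one file.
occs = [
    Occurrence(1, "alpha", 10, 0),
    Occurrence(2, "beta", 10, 0),
    Occurrence(3, "alpha", 10, 3),
]

kept = [o.occ_id for o in occs if keep(o, occs, "be.*")]
print(kept)
```

Occurrence 1 is dropped because "beta" occurs on the same line of the same file, while occurrence 3 survives because the only matching occurrence is three lines away. `fullmatch` is used on the assumption that the regex must match the whole property value.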

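As an alternative, I considered adding extra start nodes (since I have an index on the non-reachable nodes) so that the text comparison can be removed from the WHERE clause. However, if no node in Neo4j matches the wildcard, the query returns nothing at all.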

start n=node:entities("text:*") 
    , n4=node:entities("text:nonreachablenodestextwildcard") 
    , n7=node:entities("text:othernonreachablenodestextwildcard") 
MATCH p=(n)-[left:OCCURS]->(f) 
    , p4=(f)<-[right4?:OCCURS]-(n4) 
    , p7=(f)<-[right7?:OCCURS]-(n7) 
WHERE ( ( 
    ((p4 = null 
     OR left.line_nr < right4.line_nr - 0 
     OR left.line_nr > right4.line_nr + 0 
     OR ID(left) = ID(right4) ) ) ) 
    AND ( 
    ((p7 = null 
     OR left.line_nr < right7.line_nr - 0 
     OR left.line_nr > right7.line_nr + 0 
     OR ID(left) = ID(right7) ) ) 
) ) 
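The empty result described above is consistent with multiple start points being combined as a cross product, so one empty index lookup leaves no rows to match against. The following Python sketch models that assumption only; the hit lists and the `or_null` helper are hypothetical, and it does not reproduce Neo4j's actual execution.

```python
from itertools import product

# Hypothetical index hits for each start point.
n_hits = ["n_a", "n_b"]  # text:* matched two entities
n4_hits = []             # the wildcard for n4 matched nothing
n7_hits = ["n7_a"]       # the wildcard for n7 matched one entity

# A cross product with an empty factor is empty, so no row survives.
rows = list(product(n_hits, n4_hits, n7_hits))
print(rows)


def or_null(hits):
    """Model an optional lookup: fall back to a single null placeholder."""
    return hits if hits else [None]


# With optional lookups, the rows for n are preserved.
rows_optional = list(product(n_hits, or_null(n4_hits), or_null(n7_hits)))
print(rows_optional)
```

Under this model, the start-node variant can only work when every extra index lookup returns at least one node, which is why the regex comparison on an optional match behaves differently.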

Old update: As mentioned in the answer, I can use predicate functions to construct inner queries. I therefore updated the query:

start n=node:entities("text:*") 
MATCH p=(n)-[left:OCCURS]->(f) 
WHERE ( ( 
    (NONE(path in (f)<-[:OCCURS]-(n4) 
     WHERE 
      (LAST(nodes(path))).text =~ "nonreachablenodestextRegex" 
      AND FIRST(r4 in rels(p)).line_nr <= left.line_nr 
      AND FIRST(r4 in rels(p)).line_nr >= left.line_nr 
    ) 
    )) 
    AND ( 
    (NONE(path in (f)<-[:OCCURS]-(n7) 
     WHERE 
      (LAST(nodes(path))).text =~ "othernonreachablenodestextRegex" 
      AND FIRST(r7 in rels(p)).line_nr <= left.line_nr 
      AND FIRST(r7 in rels(p)).line_nr >= left.line_nr 
    ) 
    )) 
    ) 
WITH n, left, f, count(*) as group_by_cause 
RETURN .... 

This gives me a java.lang.OutOfMemoryError:

java.lang.OutOfMemoryError: Java heap space 
at java.util.regex.Pattern.compile(Pattern.java:1432) 
at java.util.regex.Pattern.<init>(Pattern.java:1133) 
at java.util.regex.Pattern.compile(Pattern.java:823) 
at scala.util.matching.Regex.<init>(Regex.scala:38) 
at scala.collection.immutable.StringLike$class.r(StringLike.scala:226) 
at scala.collection.immutable.StringOps.r(StringOps.scala:31) 
at org.neo4j.cypher.internal.parser.v1_9.Base.ignoreCase(Base.scala:31) 
at org.neo4j.cypher.internal.parser.v1_9.Base.ignoreCases(Base.scala:49) 
at org.neo4j.cypher.internal.parser.v1_9.Base$$anonfun$ignoreCases$1.apply(Base.scala:49) 
at org.neo4j.cypher.internal.parser.v1_9.Base$$anonfun$ignoreCases$1.apply(Base.scala:49) 
at scala.util.parsing.combinator.Parsers$Parser.p$3(Parsers.scala:209) 
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210) 
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210) 
at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:163) 
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:210) 
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:210) 
at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:183) 
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210) 
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210) 
at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:163) 

(the last 6 lines are repeated many more times)

Solution: The previous update probably contained a syntax error somewhere. Fixing it gives the slightly different query below:

start n=node:entities("text:*") 
MATCH p=(n)-[left:OCCURS]->(f) 
WHERE ( 
(NONE (path in (f)<-[:OCCURS]-() 
    WHERE 
     ANY(n4 in nodes(path) 
     WHERE ID(n4) <> ID(n) 
      AND n4.type = 'ENTITY' 
      AND n4.text =~ "a regex expr" 
     ) 
     AND ALL(r4 in rels(path) 
     WHERE r4.line_nr <= left.line_nr + 0 
      AND r4.line_nr >= left.line_nr - 0 
     ) 
    ) 
)) 
AND 
NONE (...... ) 
WITH n, left, f, count(*) as group_by_cause 
RETURN ... 

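However, it is slow: it takes on the order of seconds (> 10) for a small graph with 4 entity nodes and 6 :OCCURS relationships in total, all pointing to a single destination f node, with line_nr values between 0 and 3.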

Performance update: The following is about twice as fast:

start n=node:entities("text:*") 
MATCH p=(n)-[left:OCCURS]->(f) 
, p4=(f)<-[right4?:OCCURS]-(n4) 
, p7=(f)<-[right7?:OCCURS]-(n7) 
WHERE 
(n4.text? =~ "regex1" 
    AND (p4 = null 
     OR left.line_nr < right4.line_nr - 0 
     OR left.line_nr > right4.line_nr + 0 
     OR ID(left) = ID(right4) 
     ) 
) 
AND 
(n7.text? =~ "regex2" 
    AND (p7 = null .....) 
) 
WITH n, left, f, count(*) as group_by_cause 
RETURN .... 

Answer

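I think you should use a pattern predicate in WHERE instead of optional relationships. A pattern expression actually returns a collection of paths, so you can apply collection predicates such as ALL, NONE, ANY, and SINGLE: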

WHERE NONE(path in (f)<-[:OCCURS]-(n4) WHERE 
      ALL(r in rels(p) : r.line_nr = 42)) 

See: http://docs.neo4j.org/chunked/milestone/query-function.html#_predicates
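For readers less familiar with the four collection predicates named above, this is a rough Python model of what each one tests over a collection. The function names and sample line numbers are hypothetical; the sketch mirrors the documented meaning of the predicates and is not Neo4j code.

```python
def all_pred(items, pred):
    """ALL: the predicate holds for every element."""
    return all(pred(x) for x in items)


def any_pred(items, pred):
    """ANY: the predicate holds for at least one element."""
    return any(pred(x) for x in items)


def none_pred(items, pred):
    """NONE: the predicate holds for no element."""
    return not any(pred(x) for x in items)


def single_pred(items, pred):
    """SINGLE: the predicate holds for exactly one element."""
    return sum(1 for x in items if pred(x)) == 1


# Hypothetical line numbers of the relationships found by a pattern.
line_nrs = [0, 1, 1, 3]

print(all_pred(line_nrs, lambda nr: nr == 1))
print(any_pred(line_nrs, lambda nr: nr == 1))
print(none_pred(line_nrs, lambda nr: nr == 1))
print(single_pred(line_nrs, lambda nr: nr == 3))
```

Note that in this model NONE and ALL both hold for an empty collection, which matches the intent of the question: a file with no other occurrences should not filter anything out.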

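The compiler complained that it expected a WHERE at the position of the ":". Fixing that, replacing p with path, and replacing (n4) with () did indeed fix the query. – gerben 2013-03-27 15:11:24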