2015-01-21 28 views
6

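Recursive query with sub-graph aggregation (arbitrary depth)

I asked a question earlier about aggregating quantities along a graph. The two answers provided worked well, but now I am trying to extend the Cypher query to a graph of variable depth.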

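To summarize, we start with a set of leaf stores, each associated with a specific supplier, which is stored as a property on the Store node. Inventory is then moved to other stores, and the proportion attributed to each supplier corresponds to that supplier's contribution to the originating store.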

So for node B02, S2 contributed 750/1250 = 60% and S3 contributed 40%. We then move 600 units out of B02, of which 60% belong to S2 and 40% to S3, and so on.
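
The arithmetic for B02 can be written out as a short Python sketch. The variable names are illustrative only; the quantities are the ones from the graph in the question.

```python
# Quantities moved into B02, keyed by the supplier of the originating store
# (A03 belongs to S2, A04 belongs to S3).
inflow = {"S2": 750, "S3": 500}
total = sum(inflow.values())  # 1250 units arrive at B02

# Each supplier's share of the stock held at B02.
share = {supplier: qty / total for supplier, qty in inflow.items()}

# 600 units leave B02 for C01 and are split in the same proportions.
moved = 600
moved_by_supplier = {supplier: moved * s for supplier, s in share.items()}

print(share)              # S2 holds 60%, S3 holds 40%
print(moved_by_supplier)  # roughly 360 units for S2 and 240 for S3
```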


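What we want to know is what percentage of the final 700 units moved into D01 belongs to each supplier. Suppliers with the same name are the same supplier. So for the graph above we expect: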

S1,38.09
S2,27.61
S3,34.28

I prepared the graph with this Cypher script:

CREATE (A01:Store {Name: 'A01', Supplier: 'S1'}) 
CREATE (A02:Store {Name: 'A02', Supplier: 'S1'}) 
CREATE (A03:Store {Name: 'A03', Supplier: 'S2'}) 
CREATE (A04:Store {Name: 'A04', Supplier: 'S3'}) 
CREATE (A05:Store {Name: 'A05', Supplier: 'S1'}) 
CREATE (A06:Store {Name: 'A06', Supplier: 'S1'}) 
CREATE (A07:Store {Name: 'A07', Supplier: 'S2'}) 
CREATE (A08:Store {Name: 'A08', Supplier: 'S3'}) 

CREATE (B01:Store {Name: 'B01'}) 
CREATE (B02:Store {Name: 'B02'}) 
CREATE (B03:Store {Name: 'B03'}) 
CREATE (B04:Store {Name: 'B04'}) 

CREATE (C01:Store {Name: 'C01'}) 
CREATE (C02:Store {Name: 'C02'}) 

CREATE (D01:Store {Name: 'D01'}) 

CREATE (A01)-[:MOVE_TO {Quantity: 750}]->(B01) 
CREATE (A02)-[:MOVE_TO {Quantity: 500}]->(B01) 
CREATE (A03)-[:MOVE_TO {Quantity: 750}]->(B02) 
CREATE (A04)-[:MOVE_TO {Quantity: 500}]->(B02) 
CREATE (A05)-[:MOVE_TO {Quantity: 100}]->(B03) 
CREATE (A06)-[:MOVE_TO {Quantity: 200}]->(B03) 
CREATE (A07)-[:MOVE_TO {Quantity: 50}]->(B04) 
CREATE (A08)-[:MOVE_TO {Quantity: 450}]->(B04) 

CREATE (B01)-[:MOVE_TO {Quantity: 400}]->(C01) 
CREATE (B02)-[:MOVE_TO {Quantity: 600}]->(C01) 
CREATE (B03)-[:MOVE_TO {Quantity: 100}]->(C02) 
CREATE (B04)-[:MOVE_TO {Quantity: 200}]->(C02) 

CREATE (C01)-[:MOVE_TO {Quantity: 500}]->(D01) 
CREATE (C02)-[:MOVE_TO {Quantity: 200}]->(D01) 

The current query looks like this:

MATCH (s:Store { Name:'D01' }) 
MATCH (s)<-[t:MOVE_TO]-()<-[r:MOVE_TO]-(supp) 
WITH t.Quantity as total, collect(r) as movements 
WITH total, movements, reduce(totalSupplier = 0, r IN movements | totalSupplier + r.Quantity) as supCount 
UNWIND movements as movement 
RETURN startNode(movement).Supplier as Supplier, round(100.0*movement.Quantity/supCount) as pct 

I would like to use a recursive (variable-length) relationship, something along these lines:

MATCH (s)<-[t:MOVE_TO]-()<-[r:MOVE_TO*]-(supp) 

However, because multiple paths lead to the end node, I think I need to aggregate the inventory at each node along the way.
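
To make the "aggregate at each node" idea concrete, here is a minimal Python sketch that computes the supplier mix of a store recursively from the mixes of the stores feeding it. It is not a Cypher solution; it only illustrates the calculation, using the data from the script above. The names `SUPPLIER`, `MOVES` and `supplier_shares` are made up for this example, and the graph is assumed to be acyclic.

```python
from collections import defaultdict
from functools import lru_cache

# Leaf stores and their suppliers (from the CREATE script in the question).
SUPPLIER = {"A01": "S1", "A02": "S1", "A03": "S2", "A04": "S3",
            "A05": "S1", "A06": "S1", "A07": "S2", "A08": "S3"}

# (source, target, quantity) for every MOVE_TO relationship.
MOVES = [("A01", "B01", 750), ("A02", "B01", 500), ("A03", "B02", 750),
         ("A04", "B02", 500), ("A05", "B03", 100), ("A06", "B03", 200),
         ("A07", "B04", 50), ("A08", "B04", 450), ("B01", "C01", 400),
         ("B02", "C01", 600), ("B03", "C02", 100), ("B04", "C02", 200),
         ("C01", "D01", 500), ("C02", "D01", 200)]

incoming = defaultdict(list)
for src, dst, qty in MOVES:
    incoming[dst].append((src, qty))

@lru_cache(maxsize=None)
def supplier_shares(store):
    """Return (supplier, fraction) pairs describing the stock held at `store`."""
    if store in SUPPLIER:
        return ((SUPPLIER[store], 1.0),)  # a leaf store is 100% one supplier
    total = sum(qty for _, qty in incoming[store])
    mix = defaultdict(float)
    for src, qty in incoming[store]:
        # Stock arriving from `src` carries the supplier mix of `src`.
        for supplier, fraction in supplier_shares(src):
            mix[supplier] += fraction * qty / total
    return tuple(sorted(mix.items()))

for supplier, fraction in supplier_shares("D01"):
    print(supplier, round(100 * fraction, 2))
```

For D01 the printed percentages should agree with the expected figures listed in the question, up to rounding.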

+0

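I have thought about this, but the problem is that I don't think Cypher really does recursion. Cypher evaluates one sub-graph at a time with its `MATCH`, which in this case is a single path spanning the depth of the tree. You, however, want to compare paths with each other – 2015-01-21 08:57:44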

+1

Also, if you only want the paths from the store back to the original supplier nodes, you want something like `MATCH (target:Store {Name:'D01'})<-[r:MOVE_TO*]-(source:Store) WHERE source.Supplier IS NOT NULL` – 2015-01-21 08:58:54

+0

In addition to Brian's suggestion, you could equally use `WHERE NOT (source)<-[:MOVE_TO]-()` – JohnMark13 2015-01-23 10:45:43

Answers

2

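This query generates the correct results for any arbitrary graph that conforms to the model described in the question. (When Store X moves goods to Store Y, it is assumed that the Supplier percentages of the moved goods are the same as those of Store X.)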

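However, this solution does not consist of just a single Cypher query (since that may not be possible). Instead, it involves multiple queries, one of which must be iterated until the calculation has cascaded through the entire graph of Store nodes. That iterated query tells you clearly when to stop iterating. The other Cypher queries are needed to: prepare the graph for iteration, report the supplier percentages for the "end" node(s), and clean up the graph (so that it reverts to the state it was in before step 1 below).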

These queries could probably be optimized further.

Here are the required steps:

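  1. Prepare the graph for the iterative query (initialize the temporary pcts arrays for all the starting Store nodes). This includes creating a singleton Suppliers node that holds an array of all the supplier names. That array establishes the order of the elements in the temporary pcts arrays and is used to map those elements back to the correct supplier names.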

    MATCH (store:Store) 
    WHERE HAS (store.Supplier) 
    WITH COLLECT(store) AS stores, COLLECT(DISTINCT store.Supplier) AS csup 
    CREATE (sups:Suppliers { names: csup }) 
    WITH stores, sups 
    UNWIND stores AS store 
    SET store.pcts = 
        EXTRACT(i IN RANGE(0,LENGTH(sups.names)-1,1) | 
        CASE WHEN store.Supplier = sups.names[i] THEN 1.0 ELSE 0.0 END) 
    RETURN store.Name, store.Supplier, store.pcts; 
    

    Here is the result with the question's data:

    +---------------------------------------------+
    | store.Name | store.Supplier | store.pcts    |
    +---------------------------------------------+
    | "A01"      | "S1"           | [1.0,0.0,0.0] |
    | "A02"      | "S1"           | [1.0,0.0,0.0] |
    | "A03"      | "S2"           | [0.0,1.0,0.0] |
    | "A04"      | "S3"           | [0.0,0.0,1.0] |
    | "A05"      | "S1"           | [1.0,0.0,0.0] |
    | "A06"      | "S1"           | [1.0,0.0,0.0] |
    | "A07"      | "S2"           | [0.0,1.0,0.0] |
    | "A08"      | "S3"           | [0.0,0.0,1.0] |
    +---------------------------------------------+
    8 rows 
    83 ms 
    Nodes created: 1 
    Properties set: 9 
    
  2. Iterative query (run it repeatedly until it returns 0 rows)

    MATCH p=(s1:Store)-[m:MOVE_TO]->(s2:Store) 
    WHERE HAS(s1.pcts) AND NOT HAS(s2.pcts) 
    SET s2.pcts = EXTRACT(i IN RANGE(1,LENGTH(s1.pcts),1) | 0) 
    WITH s2, COLLECT(p) AS ps 
    WITH s2, ps, REDUCE(s=0, p IN ps | s + HEAD(RELATIONSHIPS(p)).Quantity) AS total 
    FOREACH(p IN ps | 
        SET HEAD(RELATIONSHIPS(p)).pcts = EXTRACT(parentPct IN HEAD(NODES(p)).pcts | parentPct * HEAD(RELATIONSHIPS(p)).Quantity/total) 
    ) 
    FOREACH(p IN ps | 
        SET s2.pcts = EXTRACT(i IN RANGE(0,LENGTH(s2.pcts)-1,1) | s2.pcts[i] + HEAD(RELATIONSHIPS(p)).pcts[i]) 
    ) 
    RETURN s2.Name, s2.pcts, total, EXTRACT(p IN ps | HEAD(RELATIONSHIPS(p)).pcts) AS rel_pcts; 
    

    Result of iteration 1:

    +-----------------------------------------------------------------------------------------------+
    | s2.Name | s2.pcts       | total | rel_pcts                                                    |
    +-----------------------------------------------------------------------------------------------+
    | "B04"   | [0.0,0.1,0.9] | 500   | [[0.0,0.1,0.0],[0.0,0.0,0.9]]                               |
    | "B01"   | [1.0,0.0,0.0] | 1250  | [[0.6,0.0,0.0],[0.4,0.0,0.0]]                               |
    | "B03"   | [1.0,0.0,0.0] | 300   | [[0.3333333333333333,0.0,0.0],[0.6666666666666666,0.0,0.0]] |
    | "B02"   | [0.0,0.6,0.4] | 1250  | [[0.0,0.6,0.0],[0.0,0.0,0.4]]                               |
    +-----------------------------------------------------------------------------------------------+
    4 rows 
    288 ms 
    Properties set: 24 
    

    Result of iteration 2:

    +-------------------------------------------------------------------------------------------------------------------------------+
    | s2.Name | s2.pcts                                      | total | rel_pcts                                                     |
    +-------------------------------------------------------------------------------------------------------------------------------+
    | "C02"   | [0.3333333333333333,0.06666666666666667,0.6] | 300   | [[0.3333333333333333,0.0,0.0],[0.0,0.06666666666666667,0.6]] |
    | "C01"   | [0.4,0.36,0.24]                              | 1000  | [[0.4,0.0,0.0],[0.0,0.36,0.24]]                              |
    +-------------------------------------------------------------------------------------------------------------------------------+
    2 rows 
    193 ms 
    Properties set: 12 
    

    迭代3結果:

    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 
    | s2.Name | s2.pcts              | total | rel_pcts                             | 
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 
    | "D01" | [0.38095238095238093,0.27619047619047615,0.34285714285714286] | 700 | [[0.2857142857142857,0.2571428571428571,0.17142857142857143],[0.09523809523809522,0.01904761904761905,0.17142857142857143]] | 
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 
    1 row 
    40 ms 
    Properties set: 6 
    

    Result of iteration 4 (no rows, so the iteration stops here):

    +--------------------------------------+ 
    | s2.Name | s2.pcts | total | rel_pcts | 
    +--------------------------------------+ 
    +--------------------------------------+ 
    0 rows 
    69 ms 
    
  3. List the non-zero Supplier percentages for the end Store node(s).

    MATCH (store:Store), (sups:Suppliers) 
    WHERE NOT (store:Store)-[:MOVE_TO]->(:Store) AND HAS(store.pcts) 
    RETURN store.Name, [i IN RANGE(0,LENGTH(sups.names)-1,1) WHERE store.pcts[i] > 0 | {supplier: sups.names[i], pct: store.pcts[i] * 100}] AS pcts; 
    

    Result:

    +----------------------------------------------------------------------------------------------------------------------------------+
    | store.Name | pcts                                                                                                                |
    +----------------------------------------------------------------------------------------------------------------------------------+
    | "D01"      | [{supplier=S1, pct=38.095238095238095},{supplier=S2, pct=27.619047619047617},{supplier=S3, pct=34.285714285714285}] |
    +----------------------------------------------------------------------------------------------------------------------------------+
    1 row 
    293 ms 
    
  4. Clean up (remove all the temporary pcts properties and the Suppliers node).

    MATCH (s:Store), (sups:Suppliers) 
    OPTIONAL MATCH (s)-[m:MOVE_TO]-() 
    REMOVE m.pcts, s.pcts 
    DELETE sups; 
    

    Result:

    0 rows 
    203 ms 
    +-------------------+ 
    | No data returned. | 
    +-------------------+ 
    Properties set: 29 
    Nodes deleted: 1 
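
For readers who want to follow the numbers without a database, the four steps can be mirrored in a short Python sketch. This is only a simulation of the idea, not the Cypher itself: `SUPPLIER`, `MOVES`, `pcts` and `passes` are illustrative names, nothing is written to a graph (so there is no clean-up step), and, unlike the iterative query, the loop waits until every store feeding a target has been computed, which is an assumption added here to handle stores fed from different depths.

```python
SUPPLIER = {"A01": "S1", "A02": "S1", "A03": "S2", "A04": "S3",
            "A05": "S1", "A06": "S1", "A07": "S2", "A08": "S3"}
MOVES = [("A01", "B01", 750), ("A02", "B01", 500), ("A03", "B02", 750),
         ("A04", "B02", 500), ("A05", "B03", 100), ("A06", "B03", 200),
         ("A07", "B04", 50), ("A08", "B04", 450), ("B01", "C01", 400),
         ("B02", "C01", 600), ("B03", "C02", 100), ("B04", "C02", 200),
         ("C01", "D01", 500), ("C02", "D01", 200)]

# Step 1: fix the supplier order and give each starting store a one-hot vector.
names = sorted(set(SUPPLIER.values()))
pcts = {store: [1.0 if sup == n else 0.0 for n in names]
        for store, sup in SUPPLIER.items()}

# Step 2: repeat until a pass computes nothing new (the "0 rows" condition).
passes = 0
while True:
    ready = {}
    for dst in {d for _, d, _ in MOVES}:
        if dst in pcts:
            continue
        inflow = [(src, qty) for src, d, qty in MOVES if d == dst]
        if all(src in pcts for src, _ in inflow):
            total = sum(qty for _, qty in inflow)
            # Weighted sum of the parents' vectors, weighted by quantity moved.
            ready[dst] = [sum(pcts[src][i] * qty / total for src, qty in inflow)
                          for i in range(len(names))]
    if not ready:
        break
    pcts.update(ready)
    passes += 1

# Step 3: report the non-zero percentages for stores that move nothing onward.
sources = {src for src, _, _ in MOVES}
end_stores = {store: {n: 100 * p for n, p in zip(names, vec) if p > 0}
              for store, vec in pcts.items() if store not in sources}
print(passes, end_stores)
```

With the question's data the loop should need three productive passes, matching the three non-empty iterations shown above.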
    
+0

@NedStoyanov: Did this work for you? – cybersam 2015-01-29 18:47:05

+0

Thanks @cubersam, this query gets the correct results. I had the wrong expected results in the question. Thanks for your effort. – 2015-01-29 23:27:26

+0

@Ned Stoyanov give this man the bounty! – 2015-01-30 01:53:29

2

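I can't think my way through to a pure-Cypher solution, because I don't think you can do this kind of recursion in Cypher. However, you can use Cypher to return all of the data in the tree in a simple form, so that you can do the calculation in your favorite programming language. Something like this: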

MATCH path=(source:Store)-[move:MOVE_TO*]->(target:Store {Name: 'D01'}) 
WHERE source.Supplier IS NOT NULL 
RETURN 
    source.Supplier, 
    reduce(a=[], move IN relationships(path)| a + [{id: ID(move), Quantity: move.Quantity}]) 

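This returns the id and quantity of every relationship in each path. You can then process that on the client side (perhaps by first converting it into a nested data structure?)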
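
As a rough idea of what that client-side processing could look like, here is a Python sketch. The rows are hand-written in the shape the query returns; the relationship ids are invented for illustration. It also assumes that every store has at most one outgoing MOVE_TO, so that the relationship following a hop in a path identifies the store the hop flows into. If a store can move stock to several targets, the query would need to return node ids as well.

```python
from collections import defaultdict

def rel(rel_id, quantity):
    """Build one relationship entry in the shape returned by the query."""
    return {"id": rel_id, "Quantity": quantity}

# (source.Supplier, relationships ordered from source to D01); ids are made up.
rows = [
    ("S1", [rel(1, 750), rel(9, 400), rel(13, 500)]),   # A01 -> B01 -> C01 -> D01
    ("S1", [rel(2, 500), rel(9, 400), rel(13, 500)]),   # A02 -> B01 -> C01 -> D01
    ("S2", [rel(3, 750), rel(10, 600), rel(13, 500)]),  # A03 -> B02 -> C01 -> D01
    ("S3", [rel(4, 500), rel(10, 600), rel(13, 500)]),  # A04 -> B02 -> C01 -> D01
    ("S1", [rel(5, 100), rel(11, 100), rel(14, 200)]),  # A05 -> B03 -> C02 -> D01
    ("S1", [rel(6, 200), rel(11, 100), rel(14, 200)]),  # A06 -> B03 -> C02 -> D01
    ("S2", [rel(7, 50), rel(12, 200), rel(14, 200)]),   # A07 -> B04 -> C02 -> D01
    ("S3", [rel(8, 450), rel(12, 200), rel(14, 200)]),  # A08 -> B04 -> C02 -> D01
]

def hops(rels):
    """Yield (relationship, id of the next relationship or None for the last hop)."""
    for current, following in zip(rels, rels[1:] + [None]):
        yield current, (following["id"] if following else None)

def supplier_percentages(rows):
    # Total inflow per receiving store, keyed by that store's outgoing
    # relationship id (None stands for the final target). Relationships shared
    # by several paths are counted once thanks to the inner dict.
    siblings = defaultdict(dict)
    for _, rels in rows:
        for current, next_id in hops(rels):
            siblings[next_id][current["id"]] = current["Quantity"]
    inflow = {key: sum(qtys.values()) for key, qtys in siblings.items()}

    # A path's weight is the product of quantity / inflow over its hops.
    result = defaultdict(float)
    for supplier, rels in rows:
        weight = 1.0
        for current, next_id in hops(rels):
            weight *= current["Quantity"] / inflow[next_id]
        result[supplier] += weight
    return {supplier: 100 * w for supplier, w in result.items()}

print(supplier_percentages(rows))
```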

+0

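Thanks for your answer, I like your trick of accumulating the movements into an array. I might wait a little longer to see whether there are other answers. – 2015-01-21 22:37:23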

+0

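Fair enough ;) I would definitely like to see another answer. Also, I don't know which language you are using, but I should mention that if you are using Java, or something that integrates with the Java API, you can access your database through the Neo4j Java API. You would need to run in embedded mode, though, which has its own difficulties. – 2015-01-22 11:39:42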

+0

We are using C#, so we would like to avoid writing any Java code – 2015-01-22 12:15:02

3

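As I said before, I like this question. I know you have already accepted an answer, but I decided to post my final answer because it also returns the percentages without any client-side effort (which means you could also do a SET on the nodes to update the values in the database) and, of course, so that I have one I can come back to here for any other reason :) Here is a link to the console example.

It returns one row per store name, with all the suppliers that moved stock to it and each supplier's percentage: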

MATCH p =s<-[:MOVE_TO*]-sup 
WHERE HAS (sup.Supplier) AND NOT HAS (s.Supplier) 
WITH s,sup,reduce(totalSupplier = 0, r IN relationships(p)| totalSupplier + r.Quantity) AS TotalAmountMoved 
WITH sum(TotalAmountMoved) AS sumMoved, collect(DISTINCT ([sup.Supplier, TotalAmountMoved])) AS MyDataPart1,s 
WITH reduce(b=[], c IN MyDataPart1| b +[{ Supplier: c[0], Quantity: c[1], Percentile: ((c[1]*1.00))/(sumMoved*1.00)*100.00 }]) AS MyData, s, sumMoved 
RETURN s.Name, sumMoved, MyData 
+0

Interesting. Is this general? Would it still work if there were one more level? – 2015-01-27 06:35:05

+0

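This should work with as many levels as you want. Of course, you can also restrict it to a particular store or supplier by adding a filter to the s or sup match – cechode 2015-01-27 06:50:40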

+1

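Unfortunately, the math used by this query is incorrect. The actual calculation of the final percentages does not involve adding up all the quantities along each path to 'D01' and then using the grand total as the denominator of the percentage calculation. Instead, you have to calculate the percentages at each Store node and then multiply the appropriate percentages along each path. I will create an answer that produces the correct result (but it requires iterative invocations). – cybersam 2015-01-29 07:49:59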
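
The difference between the two calculations can be checked with a small Python sketch. The per-hop numbers are read off the question's graph, and the function names are illustrative. The "additive" version sums the quantities along each path and, as a simplification for comparison, groups the per-path entries by supplier; the "multiplicative" version multiplies quantity / inflow along each path as described in the comment above.

```python
from collections import defaultdict
from math import prod

# (supplier, [(quantity moved, total inflow at the receiving store), ...])
# for each of the eight paths that end at D01.
PATHS = [
    ("S1", [(750, 1250), (400, 1000), (500, 700)]),  # A01 -> B01 -> C01 -> D01
    ("S1", [(500, 1250), (400, 1000), (500, 700)]),  # A02 -> B01 -> C01 -> D01
    ("S2", [(750, 1250), (600, 1000), (500, 700)]),  # A03 -> B02 -> C01 -> D01
    ("S3", [(500, 1250), (600, 1000), (500, 700)]),  # A04 -> B02 -> C01 -> D01
    ("S1", [(100, 300), (100, 300), (200, 700)]),    # A05 -> B03 -> C02 -> D01
    ("S1", [(200, 300), (100, 300), (200, 700)]),    # A06 -> B03 -> C02 -> D01
    ("S2", [(50, 500), (200, 300), (200, 700)]),     # A07 -> B04 -> C02 -> D01
    ("S3", [(450, 500), (200, 300), (200, 700)]),    # A08 -> B04 -> C02 -> D01
]

def additive(paths):
    """Sum quantities along each path and divide by the grand total."""
    moved = defaultdict(float)
    for supplier, path in paths:
        moved[supplier] += sum(qty for qty, _ in path)
    grand_total = sum(moved.values())
    return {s: 100 * m / grand_total for s, m in moved.items()}

def multiplicative(paths):
    """Multiply the per-store fractions along each path, then add up paths."""
    share = defaultdict(float)
    for supplier, path in paths:
        share[supplier] += prod(qty / inflow for qty, inflow in path)
    return {s: 100 * v for s, v in share.items()}

print("additive:      ", additive(PATHS))
print("multiplicative:", multiplicative(PATHS))
```

On this data the additive version attributes roughly 45% to S1, whereas the multiplicative version gives roughly 38%, in line with the expected results in the question.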