2016-06-19 37 views
3

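How to group_by and calculate percentages using a Cypher query in Neo4j

I created 3 kinds of nodes in a graph database: origin airport, destination airport, and carrier. They are connected by a relationship named 'cancelled_by'.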

MATCH (origin:origin_airport {name: row.ORIGIN}),
      (destination:dest_airport {name: row.DEST}),
      (carrier:Carrier {name: row.UNIQUE_CARRIER})
CREATE (origin)-[:cancelled_by {cancellation: row.count}]->(carrier)
CREATE (origin)-[:cancelled_by {cancellation: row.count}]->(destination)
CREATE (origin)-[:operated_by {carrier: row.UNIQUE_CARRIER}]->(carrier)

cancelled_by holds the number of cancellations for a particular carrier. My input file is in the following format:

ORIGIN UNIQUE_CARRIER DEST Cancelled 
ABE DL    ATL 1 
ABE EV    ATL 1 
ABE EV    DTW 3 
ABE EV    ORD 3 
ABQ DL    DFW 2 
ABQ B6    JFK 2 

I need to calculate the cancellation percentage for each carrier. This is the result I expect:

UNIQUE_CARRIER  Percentage_Cancelled
DL              25%
EV              58.33%
B6              16.66%

Example: total number of cancellations = 12
Number of cancellations for DL = 3
Percentage of cancellations for DL = (3/12)*100 = 25%
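
The expected figures can be checked against the sample rows alone, without touching the graph. The following is only a sketch: the map keys `carrier` and `cancelled` are names made up for this illustration, and the rows are copied by hand from the input table above.

```cypher
// Sample rows copied from the input table (hypothetical key names).
UNWIND [
  {carrier: 'DL', cancelled: 1},
  {carrier: 'EV', cancelled: 1},
  {carrier: 'EV', cancelled: 3},
  {carrier: 'EV', cancelled: 3},
  {carrier: 'DL', cancelled: 2},
  {carrier: 'B6', cancelled: 2}
] AS row
// Sum per carrier: row.carrier is the grouping key.
WITH row.carrier AS carrier, sum(row.cancelled) AS cancelled
// Collapse to one row holding every per-carrier sum plus the grand total.
WITH collect({carrier: carrier, cancelled: cancelled}) AS groups,
     sum(cancelled) AS total
// Expand back to one row per carrier and divide by the grand total.
UNWIND groups AS g
RETURN g.carrier AS carrier, 100.0 * g.cancelled / total AS percent
ORDER BY percent DESC
```

With these six rows the per-carrier sums are 7, 3, and 2 out of 12, so the query should return about 58.33 for EV, 25 for DL, and 16.67 for B6.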

The following query gives the total number of cancellations for each carrier:

MATCH()-[ca:cancelled_by]->(c:Carrier) 
RETURN c.name AS Carrier, 
SUM(toFloat(ca.cancellation)) As sum 
ORDER BY sum DESC 
LIMIT 10 

I tried the following query to calculate the percentage:

MATCH()-[ca:cancelled_by]->(c:Carrier) 
    WITH SUM(toFloat(ca.cancellation)) As total 
    MATCH()-[ca:cancelled_by]->(c:Carrier) 
    RETURN c.name AS Carrier, 
    (toFloat(ca.cancellation)/total)*100 AS percent 
    ORDER BY percent DESC 
    LIMIT 10 

But instead of calculating the percentage per group, it calculates a percentage for each relationship individually.

 Carrier sum 
     DL 0.36862408915559364 
     DL 0.34290612944706383 
     DL 0.3171881697385341 

How can I calculate percentages with a group_by using a Cypher query in Neo4j?


http://coursera.org? –


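Yes, I will register. In the meantime I am trying some problems of my own to get used to it. Do you have any suggestions for the question above? –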

Answers

4

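You forgot to sum per carrier when grouping. You also don't need to cast to float everywhere; multiplying by a float in the final calculation is enough.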

MATCH ()-[ca:cancelled_by]->(:Carrier)
WITH SUM(ca.cancellation) AS total
MATCH ()-[ca:cancelled_by]->(c:Carrier)
RETURN c.name AS Carrier,
       100.0 * SUM(ca.cancellation)/total AS percent
ORDER BY percent DESC
LIMIT 10
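
A variant of the same idea, offered as a sketch rather than a drop-in replacement. It rests on two assumptions that the thread does not confirm: that newer Neo4j releases may reject a RETURN expression mixing an aggregate with a variable that is not a grouping key (here `total`), and that `cancellation` may have been stored as a string, since LOAD CSV reads fields as strings unless they are converted. The alias `cancelled` is introduced only for this example.

```cypher
// Step 1: grand total over every cancelled_by relationship ending at a Carrier.
MATCH ()-[ca:cancelled_by]->(:Carrier)
WITH sum(toFloat(ca.cancellation)) AS total
// Step 2: per-carrier sum, with total carried along as an explicit grouping key.
MATCH ()-[ca:cancelled_by]->(c:Carrier)
WITH c.name AS Carrier, total, sum(toFloat(ca.cancellation)) AS cancelled
// Step 3: the division uses plain values only, no aggregate.
RETURN Carrier, 100.0 * cancelled / total AS percent
ORDER BY percent DESC
LIMIT 10
```

If the property is already numeric, `toFloat` is harmless; it can be dropped as the answer suggests.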

Thank you very much for your help. –

0

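Hi, you could try the R dplyr package. Use the chaining operator %>% together with the functions group_by, summarize, and transmute. group_by and summarize will give you the sum of cancellations in each group. Use the transmute function to get the relative frequency.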


I am trying to do this with a Cypher query in Neo4j. –


Oh, sorry. I saw the R tag and thought of an R solution. –