假設這是您的測試數據:
import org.graphframes.GraphFrame
val edgesDf = spark.sqlContext.createDataFrame(Seq(
("a", "b", "Mother"),
("b", "c", "Father"),
("d", "c", "Father"),
("e", "b", "Mother")
)).toDF("src", "dst", "relationship")
val graph = GraphFrame.fromEdges(edgesDf)
graph.edges.show()
+---+---+------------+
|src|dst|relationship|
+---+---+------------+
| a| b| Mother|
| b| c| Father|
| d| c| Father|
| e| b| Mother|
+---+---+------------+
您可以使用一個主題的查詢和過濾器適用於它:
graph.find("()-[e]->()").filter("e.relationship = 'Mother'").show()
+------------+
| e|
+------------+
|[a,b,Mother]|
|[e,b,Mother]|
+------------+
或者,因爲你的情況是比較簡單的,你可以申請一個過濾器到圖的邊緣:
graph.edges.filter("relationship = 'Mother'").show()
+---+---+------------+
|src|dst|relationship|
+---+---+------------+
| a| b| Mother|
| e| b| Mother|
+---+---+------------+
這裏有一些替代語法(每個獲得與立即在上面):
graph.edges.filter($"relationship" === "Mother").show()
graph.edges.filter('relationship === "Mother").show()
你提到過濾方向,但每個關係的方向編碼在圖本身(即從源到目的地)。