Spark GraphX - How do I pass an array to filter graph edges?
I am using Scala with GraphX on Spark 2.1.0. I have an array, shown below:
scala> TEMP1Vertex.take(5)
res46: Array[org.apache.spark.graphx.VertexId] = Array(-1895512637, -1745667420, -1448961741, -1352361520, -1286348803)
If I filter the edge table on a single value, say source ID -1895512637, it works:
val TEMP1Edge = graph.edges.filter { case Edge(src, dst, prop) => src == -1895512637}
scala> TEMP1Edge.take(5)
res52: Array[org.apache.spark.graphx.Edge[Int]] = Array(Edge(-1895512637,-2105158920,89), Edge(-1895512637,-2020727043,3), Edge(-1895512637,-1963423298,449), Edge(-1895512637,-1855207100,214), Edge(-1895512637,-1852287689,339))
scala> TEMP1Edge.count
17/04/03 10:20:31 WARN Executor: 1 block locks were not released by TID = 1436:[rdd_36_2]
res53: Long = 126
However, when I pass in the array, which holds a set of unique IDs, the code runs without error but returns no values, as shown below:
scala> val TEMP1Edge = graph.edges.filter { case Edge(src, dst, prop) => src == TEMP1Vertex}
TEMP1Edge: org.apache.spark.rdd.RDD[org.apache.spark.graphx.Edge[Int]] = MapPartitionsRDD[929] at filter at <console>:56
scala> TEMP1Edge.take(5)
17/04/03 10:29:07 WARN Executor: 1 block locks were not released by TID = 1471:[rdd_36_5]
res60: Array[org.apache.spark.graphx.Edge[Int]] = Array()
scala> TEMP1Edge.count
17/04/03 10:29:10 WARN Executor: 1 block locks were not released by TID = 1477:[rdd_36_5]
res61: Long = 0
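I don't know anything about GraphX, but your predicate probably always returns 'false' because 'src' and 'TEMP1Vertex' have different types. You should probably do something like 'TEMP1Vertex.contains(src)' (though I don't know whether such a method exists) –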
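A minimal sketch of the membership-test idea suggested above, not a confirmed answer from the thread. It assumes TEMP1Vertex is (or has been collected into) a local Array[VertexId] that is small enough to broadcast. The toy graph, the object name FilterEdgesBySourceIds and the helper edgesFrom are hypothetical stand-ins for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object FilterEdgesBySourceIds {
  // Hypothetical helper: keep only the edges whose source ID is in `ids`.
  // The ID set is broadcast so each executor receives a single copy.
  def edgesFrom[VD, ED](graph: Graph[VD, ED], ids: Set[VertexId]): RDD[Edge[ED]] = {
    val wanted = graph.edges.sparkContext.broadcast(ids)
    // Membership test instead of `src == TEMP1Vertex`
    graph.edges.filter(e => wanted.value.contains(e.srcId))
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("filter-edges").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Toy graph standing in for the question's `graph` (IDs and weights are invented).
    val edges = sc.parallelize(Seq(
      Edge(1L, 2L, 89),
      Edge(1L, 3L, 3),
      Edge(2L, 3L, 449),
      Edge(4L, 1L, 214)
    ))
    val graph = Graph.fromEdges(edges, defaultValue = 0)

    // Stand-in for TEMP1Vertex, assumed here to be a local array of vertex IDs.
    val temp1Vertex: Array[VertexId] = Array(1L, 4L)

    // Converting to a Set makes each lookup cheap.
    val temp1Edge = edgesFrom(graph, temp1Vertex.toSet)
    temp1Edge.collect().foreach(println)

    sc.stop()
  }
}
```

If TEMP1Vertex is actually an RDD, it cannot be referenced inside another RDD's filter. Collecting it to the driver first (when it is small) or joining on the source ID are the usual alternatives.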
I tried 'src == Traversable(TEMP1Vertex)' and 'src == Iterable(TEMP1Vertex)'; both ran without error, but neither worked. – Nagesh
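'==' is not strongly typed, mainly for interoperability with Java, so it will always compile. However, if you compare objects of different types, it will always return false (unless a specific 'equals' method is defined) –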
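To make the point about '==' concrete, here is a small plain-Scala sketch with no Spark involved. The object name EqualsVsContains and the two sample IDs (taken from the array in the question) are only for illustration. The value is ascribed to Any purely so that the comparison compiles quietly on any Scala version; Scala 2 also accepts the direct comparison, typically with a warning that it will always yield false.

```scala
object EqualsVsContains {
  val ids: Array[Long] = Array(-1895512637L, -1745667420L)
  val src: Long = -1895512637L

  // Ascribed to Any only to keep the compiler quiet about comparing unrelated types.
  val srcAny: Any = src

  // A single ID compared with the whole array: never equal.
  val vsArray: Boolean = srcAny == ids

  // Iterable(ids) is a one-element collection whose element is the array itself,
  // so wrapping the array does not turn `==` into a membership test.
  val vsIterable: Boolean = srcAny == Iterable(ids)

  // What the filter actually needs: is the ID one of the array's elements?
  val viaContains: Boolean = ids.contains(src)

  def main(args: Array[String]): Unit = {
    println(s"vsArray=$vsArray vsIterable=$vsIterable viaContains=$viaContains")
    // prints: vsArray=false vsIterable=false viaContains=true
  }
}
```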