2017-10-07

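Scala: writing the adjacency list of each node in a graph to a text file

I am trying to reverse a directed graph and write the adjacency list of each vertex to a text file in the following format: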

NodeId \t NeighbourId1,NeighbourId2,...,NeighbourIdn 

So far I have only tried printing the result, and my output looks like this:

(4,[J@...)
(0,[J@...)
(1,[J@...)
(3,[J@...)
(2,[J@...)

whereas it should be in the following format:

4 2 
0 4 
1 0,2 
3 1,2,3 
2 0,1 

The code I am currently using is:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx._

object Problem2 {
  def main(args: Array[String]): Unit = {
    val inputFile: String = args(0)
    val outputFolder = args(1)
    val conf = new SparkConf().setAppName("Problem2").setMaster("local")
    val sc = new SparkContext(conf)

    // Load the edge list, then rebuild the graph with every edge reversed.
    val graph = GraphLoader.edgeListFile(sc, inputFile)
    val edges = graph.reverse.edges
    val vertices = graph.vertices
    val newGraph = Graph(vertices, edges)

    // For each vertex, collect the ids of its out-neighbours.
    val verticesWithSuccessors: VertexRDD[Array[VertexId]] =
      newGraph.ops.collectNeighborIds(EdgeDirection.Out)

    val successorGraph = Graph(verticesWithSuccessors, edges)
    val res = successorGraph.vertices.collect()

    // Prints (id, Array) pairs; this is the output shown above.
    val adjList = successorGraph.vertices.foreach(println)
  }
}

I don't think mkString() can be used on a graph object. Do graph objects have a similar method for getting a string?
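A note on the printed output: the `[J@...` text is what the JVM shows for a `long[]` (a Scala `Array[Long]`) when its default `toString` is used, so `println` on an `(id, array)` pair cannot list the neighbours. `mkString` is defined on the array itself rather than on the graph. Below is a minimal plain-Scala sketch of that point, with no Spark involved; the names `ArrayPrinting` and `render` are invented here for illustration.

```scala
object ArrayPrinting {
  // Renders one (vertex id, neighbour ids) pair as "id<TAB>n1,n2,...".
  // mkString is called on the Array, which avoids the default "[J@<hash>" toString.
  def render(pair: (Long, Array[Long])): String = {
    val (id, neighbours) = pair
    s"$id\t${neighbours.mkString(",")}"
  }

  def main(args: Array[String]): Unit = {
    val pair = (1L, Array(0L, 2L))
    println(pair)          // default toString: the array shows up as [J@ plus a hash
    println(render(pair))  // the array's elements are joined explicitly
  }
}
```

The same idea should carry over to the `(VertexId, Array[VertexId])` pairs held in the vertex RDD, since `VertexId` is an alias for `Long`.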

Answers


Let's take this example again:

import org.apache.spark.rdd.RDD
import org.apache.spark.graphx._

// sc is an existing SparkContext.
val vertices: RDD[(VertexId, String)] =
  sc.parallelize(Array((1L, ""), (2L, ""), (4L, ""), (6L, "")))

val edges: RDD[Edge[String]] =
  sc.parallelize(Array(
    Edge(1L, 2L, ""),
    Edge(1L, 4L, ""),
    Edge(1L, 6L, "")))
val inputGraph = Graph(vertices, edges)

// For each vertex, collect the ids of its out-neighbours.
val verticesWithSuccessors: VertexRDD[Array[VertexId]] =
  inputGraph.ops.collectNeighborIds(EdgeDirection.Out)
val successorGraph = Graph(verticesWithSuccessors, edges)

val adjList = successorGraph.vertices

Once you have this, it can easily be converted to a DataFrame:

val df = adjList.toDF(Seq("node", "adjacents"): _*) 
df.show() 
+----+---------+
|node|adjacents|
+----+---------+
|   1|[2, 4, 6]|
|   2|       []|
|   4|       []|
|   6|       []|
+----+---------+

Now it is easy to transform the columns. Here is a not-so-pretty example:

val result = df.rdd.collect().map(l=> l(0).asInstanceOf[Long] + "\t" + l(1).asInstanceOf[Seq[Long]].mkString(" ")) 
result.foreach(println(_)) 

1 2 4 6 
2 
4 
6 
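The output above joins the neighbours with spaces, while the question asks for a tab followed by a comma-separated list, written to a file. The sketch below shows that formatting on an in-memory stand-in for the collected pairs, using only the standard library; `AdjacencyWriter`, `formatLine` and `write` are hypothetical names, not Spark APIs. In Spark itself, one would presumably apply the same formatting inside a `map` over the vertex RDD and then call `saveAsTextFile`, but that variant is not tested here.

```scala
import java.io.{File, PrintWriter}

object AdjacencyWriter {
  // One output line per vertex: "NodeId<TAB>Neighbour1,Neighbour2,...".
  def formatLine(id: Long, neighbours: Seq[Long]): String =
    s"$id\t${neighbours.mkString(",")}"

  // Writes the formatted lines to a local file (hypothetical helper, not a Spark API).
  def write(adjacency: Seq[(Long, Seq[Long])], target: File): Unit = {
    val out = new PrintWriter(target)
    try adjacency.foreach { case (id, ns) => out.println(formatLine(id, ns)) }
    finally out.close()
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for the collected (vertex, neighbours) pairs of the example graph.
    val adjacency = Seq(1L -> Seq(2L, 4L, 6L), 2L -> Seq.empty[Long])
    val target = File.createTempFile("adjacency", ".txt")
    write(adjacency, target)
    println(s"wrote ${target.getAbsolutePath}")
  }
}
```

Keeping the formatting in a small pure function makes it easy to check separately from any cluster setup.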

Alternatively, you can try a UDF, or process the columns in whatever way you prefer.
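For the UDF route, the core is just a plain function over the array column. The sketch below defines and exercises only that function; the Spark wiring appears as comments and is an untested assumption about how it would be attached to the `df` from this answer. `JoinIds` and `joinIds` are names invented for illustration.

```scala
object JoinIds {
  // Plain function that a UDF could wrap: joins neighbour ids with commas.
  val joinIds: Seq[Long] => String = ids => ids.mkString(",")

  // With Spark on the classpath this would presumably be used as (untested here):
  //   import org.apache.spark.sql.functions.udf
  //   val joinIdsUdf = udf(joinIds)
  //   df.withColumn("adjacents", joinIdsUdf(df("adjacents")))

  def main(args: Array[String]): Unit =
    println(joinIds(Seq(2L, 4L, 6L)))  // prints 2,4,6
}
```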

Hope this helps!

+1

For completeness: before converting to a DataFrame, create the SQL context with 'val sqlContext = new org.apache.spark.sql.SQLContext(sc)' and 'import sqlContext.implicits._' – Dee
