2015-11-06 35 views

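Error creating a graph in GraphX from edge/vertex input files

I get an error when running the code below to create a graph in Spark GraphX. I run it through the Spark shell with the following command: ./bin/spark-shell -i ex.scala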

Input:

My Vertex File looks like this (each line is a vertex of strings): 
word1,word2,word3 
word1,word2,word3 
... 
My Edge File looks like this: (edge from vertex 1 to vertex 2) 
1,2 
1,3 

Code:

// Creating Vertex RDD (the input file has 300+ records, each a list of strings separated by a comma).
// zipWithIndex assigns an index to every entry, i.e. it numbers the rows starting from 0.
val vRDD: RDD[(VertexId, Array[String])] = (vfile.map(line => line.split(","))).zipWithIndex().map(line => (line._2, line._1)) 

// Creating Edge RDD using input file 
//val eRDD: RDD[Edge[Array[String]]] = (efile.map(line => line.split(","))) 

val eRDD: RDD[(VertexId, VertexId)] = efile.map(line => line.split(",")) 

// Graph creation 
val graph = Graph(vRDD, eRDD) 

Error:

<console>:52: error: type mismatch; 
found : Array[String] 
required: org.apache.spark.graphx.Edge[Array[String]] 
      val eRDD: RDD[Edge[Array[String]]] = (efile.map(line => line.split(","))) 

<console>:57: error: type mismatch; 
found : org.apache.spark.rdd.RDD[(org.apache.spark.graphx.VertexId, org.apache.spark.graphx.VertexId)] 
required: org.apache.spark.rdd.RDD[org.apache.spark.graphx.Edge[?]] 
Error occurred in an application involving default arguments. 
     val graph = Graph(vRDD, eRDD) 

Did you reload your file? It is complaining about the line 'val eRDD: RDD[Edge[Array[String]]] = (efile.map(line => line.split(",")))', which is already commented out in the code above...


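Apart from that, your edge RDD needs to be of type 'RDD[Edge]', not a tuple of 'VertexId's (and by the way, a 'VertexId' is a 'Long', not a 'String'). You should read the documentation: http://spark.apache.org/docs/latest/graphx-programming-guide.html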
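
The point about `VertexId` being a `Long` can be illustrated without Spark at all. The sketch below is plain Scala; `parseEdgeLine` is a hypothetical helper name (it is not part of GraphX), and the choice to return `None` for malformed lines rather than fail is an assumption about what is wanted here.

```scala
import scala.util.Try

// Hypothetical helper (not a GraphX API): parse one "src,dst" line into a pair of Long ids.
// Returns None when the line does not contain exactly two numeric fields.
def parseEdgeLine(line: String): Option[(Long, Long)] =
  line.split(",").map(_.trim) match {
    case Array(src, dst) => Try((src.toLong, dst.toLong)).toOption
    case _               => None
  }

println(parseEdgeLine("1,2"))         // a well-formed edge line
println(parseEdgeLine("word1,word2")) // two fields, but not numeric
println(parseEdgeLine("1"))           // only one field
```

In a spark-shell session, the parsed pairs could then be wrapped in edges with something like `efile.flatMap(line => parseEdgeLine(line)).map { case (s, d) => Edge(s, d, 0) }`, where the `0` is just a placeholder attribute.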

Answers

0

Based on the example you gave, I created two files, one for the vertices and one for the edges:

import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD
val vfile = sc.textFile("vertices.txt")
val efile = sc.textFile("edges.txt")

Then create your vertex and edge RDDs:

val vRDD: RDD[(VertexId, Array[String])] = vfile.map(line => line.split(","))
  .zipWithIndex()
  .map(_.swap) // swap turns (fields, index) into (index, fields), replacing the manual (line._2, line._1)

// Creating Edge RDD using input file 
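// No attribute is passed to Edge, so attr keeps its default value (null for this type).
val eRDD: RDD[Edge[(VertexId, VertexId)]] = efile.map { line =>
  line.split(",", 2) match {
    case Array(n1, n2) => Edge(n1.toLong, n2.toLong)
    // a line without a comma matches no case and throws a MatchError
  }
}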

Once you have created your vertex and edge RDDs, you can create your graph:

val graph = Graph(vRDD, eRDD) 
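
A quick sanity check on the result can catch an indexing mismatch: `zipWithIndex` numbers rows from 0, while the sample edge file in the question starts at 1. The sketch below assumes a spark-shell session (where `sc` is predefined) and uses made-up sample data with hypothetical names (`sampleVertices`, `sampleEdges`, `sampleGraph`); whether the real edge file counts rows from 0 or from 1 is not stated in the question. `Graph(...)` creates any vertex that appears only in the edge list and gives it the default attribute, which is `null` unless one is passed, so listing such vertices shows whether the ids line up.

```scala
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Assumption: running inside spark-shell, so `sc` (the SparkContext) already exists.
// Three sample rows, numbered 0, 1, 2 the way zipWithIndex would number them.
val sampleVertices: RDD[(VertexId, Array[String])] = sc.parallelize(Seq(
  (0L, Array("a1", "a2", "a3")),
  (1L, Array("b1", "b2", "b3")),
  (2L, Array("c1", "c2", "c3"))
))

// Edges written as in the question's sample file: 1,2 and 1,3 (attribute 0 is a placeholder).
val sampleEdges: RDD[Edge[Int]] = sc.parallelize(Seq(Edge(1L, 2L, 0), Edge(1L, 3L, 0)))

val sampleGraph = Graph(sampleVertices, sampleEdges)

// Vertex 3 is referenced by an edge but has no row, so it carries the default attribute (null).
val missingIds = sampleGraph.vertices
  .filter { case (_, attr) => attr == null }
  .map(_._1)
  .collect()

println(s"vertices: ${sampleGraph.numVertices}, edges: ${sampleGraph.numEdges}")
println(s"ids without a row in the vertex data: ${missingIds.mkString(", ")}")
```

If ids show up in that last line, one option is to shift the index when building the vertex RDD, for example `.map { case (fields, i) => (i + 1, fields) }` in place of `.map(_.swap)`, assuming the edge file counts rows from 1.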
0

An Edge has an attr. What type is your attr? Let's assume it is an Int and initialize it to zero:

Instead of this:

val eRDD: RDD[(VertexId, VertexId)] = efile.map(line => line.split(",")) 

Try this:

val eRDD: RDD[Edge[Int]] = efile.map{ line => 
    val vs = line.split(","); 
    Edge(vs(0).toLong, vs(1).toLong, 0) 
} 
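
If the edges carry no attribute of their own, GraphX can also build the graph directly from the `(VertexId, VertexId)` pairs the question started with, using `Graph.fromEdgeTuples`. The following is a sketch under assumptions: it runs in spark-shell (`sc` predefined), the sample data and the names `pairEdges`, `rows`, `bareGraph` and `pairGraph` are made up for illustration, and every edge attribute ends up as `1`, which is what `fromEdgeTuples` assigns when `uniqueEdges` is left unset. The vertex attributes are attached afterwards with `outerJoinVertices`.

```scala
import org.apache.spark.graphx.{Graph, VertexId}
import org.apache.spark.rdd.RDD

// Assumption: running inside spark-shell, so `sc` (the SparkContext) already exists.
val pairEdges: RDD[(VertexId, VertexId)] = sc.parallelize(Seq((1L, 2L), (1L, 3L)))

val rows: RDD[(VertexId, Array[String])] = sc.parallelize(Seq(
  (1L, Array("a1", "a2", "a3")),
  (2L, Array("b1", "b2", "b3")),
  (3L, Array("c1", "c2", "c3"))
))

// Every vertex mentioned by an edge starts with an empty array as a placeholder attribute.
val bareGraph: Graph[Array[String], Int] =
  Graph.fromEdgeTuples(pairEdges, Array.empty[String])

// Replace the placeholder with the real row wherever one exists for that vertex id.
val pairGraph = bareGraph.outerJoinVertices(rows) { (id, placeholder, fields) =>
  fields.getOrElse(placeholder)
}

pairGraph.triplets.collect().foreach { t =>
  println(s"${t.srcId} -> ${t.dstId}: ${t.srcAttr.mkString(",")} | ${t.dstAttr.mkString(",")}")
}
```

Compared with mapping each line to an `Edge` by hand, this keeps the tuple-shaped edge RDD but gives up control over the edge attribute type, which is always `Int` here.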