2017-05-16

Strange INT/LONG conversion bug when using Spark GraphX

I'm a new Scala developer and a new user of Spark GraphX. So far I've really been enjoying my time with it, but I just hit a very strange bug. I've isolated the problem to a Long-to-Int conversion, but it really is odd. Another strange thing is that it works fine on Windows but not on Linux (it creates an infinite loop). I've found the line causing the problem on Linux, but I don't understand why it's a problem: I have to store the number in a variable first for it to work properly.

You should be able to copy/paste and run the whole thing.

Scala 2.10.6, Spark 2.1.0, Ubuntu 16.04

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx._
import scala.util.Random

object Main extends App {

  // Template function to print any graph
  def printGraph[VD, ED](g: Graph[VD, ED]): Unit = {
    g.vertices.collect.foreach(println)
  }

  def randomNumber(limit: Int) = {
    val start = 1
    val end = limit
    val rnd = new Random
    start + rnd.nextInt((end - start) + 1)
  }

  val conf = new SparkConf()
    .setAppName("Simple Application")
    .setMaster("local[*]")

  val sc = new SparkContext(conf)
  sc.setLogLevel("ERROR")

  val myVertices = sc.makeRDD(Array((1L, "A"), (2L, "B"), (3L, "C"), (4L, "D"), (5L, "E"), (6L, "F")))

  val myEdges = sc.makeRDD(Array(Edge(1L, 2L, ""),
    Edge(1L, 3L, ""), Edge(1L, 6L, ""), Edge(2L, 3L, ""),
    Edge(2L, 4L, ""), Edge(2L, 5L, ""), Edge(3L, 5L, ""),
    Edge(4L, 6L, ""), Edge(5L, 6L, "")))

  val myGraph = Graph(myVertices, myEdges)

  // Add a random color to each vertex. The random color is chosen from the total number of vertices.
  // Transform the vertex attribute to the color only.

  val bug = myVertices.count()
  println("Long : " + bug)
  val bugInt = bug.toInt
  println("Int : " + bugInt)

  // Problem is here, when passing myGraph.vertices.count().toInt into randomNumber.
  // Works on Windows, infinite loop on Linux.
  val g2 = myGraph.mapVertices((id, name) => randomNumber(myGraph.vertices.count().toInt))

  // Rest of code removed

}

Answer

Not sure whether you're looking for a workaround or for the underlying cause. I think the call to count is interfering with the mapVertices it's nested in: one is an action, the other is a transformation, and Spark does not support invoking an RDD action from inside the closure of another RDD operation.

The solution is to compute the count once on the driver, outside the closure:

val lim = myGraph.vertices.count().toInt
val g2 = myGraph.mapVertices((id, name) => randomNumber(lim))
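More generally, any value an RDD closure needs should be computed on the driver first and then captured, or shared via a broadcast variable. A minimal sketch of the broadcast variant, assuming the same `sc`, `myGraph`, and `randomNumber` as above (the name `limBc` is illustrative):

```scala
// Compute the vertex count once on the driver, then broadcast it so
// every executor reads the same immutable value instead of re-running
// an action inside the mapVertices closure.
val limBc = sc.broadcast(myGraph.vertices.count().toInt)

val g3 = myGraph.mapVertices((id, name) => randomNumber(limBc.value))
```

For a small scalar like a count, a plain local `val` captured by the closure (as in the fix above) works just as well; broadcasting mainly pays off when the shared data is large.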