2017-08-10 267 views
Another question from a layman: Scala - but class RDD is invariant in type T

Two RDDs that appear to have the same type but do not. Specifically:

val rdd0 = sc.parallelize(List("a", "b", "c", "d", "e")) 
val rdd1 = rdd0.map(x => (x, 110 - x.toCharArray()(0).toByte)) 
val rdd2 = sc.parallelize(List(("c", 2), ("d, 2)", ("e", 2), ("f", 2)))) 
//Seemingly the same type but not, how practically to get them to be UNIONed? 
val rddunion = rdd1.union(rdd2).collect() 

This gives:

<console>:182: error: type mismatch; 
found : org.apache.spark.rdd.RDD[Product with Serializable] 
required: org.apache.spark.rdd.RDD[(String, Int)] 
Note: Product with Serializable >: (String, Int), but class RDD is invariant in type T. 
You may wish to define T as -T instead. (SLS 4.5) 
    val rddunion = rdd1.union(rdd2).collect() 
          ^

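How can a newbie get this to work? I can now see why people are a bit hesitant about Scala. I have read some of the documentation, but it is not entirely clear. How do I get this RDD union to work?

Many thanks.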

Thanks, I really do like markdown! – thebluephantom

Answer

You put the closing quote in the wrong place when you wrote ("d, 2)"

So instead of

val rdd2 = sc.parallelize(List(("c", 2), ("d, 2)", ("e", 2), ("f", 2)))) 

the correct version is

val rdd2 = sc.parallelize(List(("c", 2), ("d", 2), ("e", 2), ("f", 2)))
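Why the misplaced quote shows up as a type error rather than a syntax error: with the quote in the wrong place, the second list element becomes a 3-tuple whose first field is the string "d, 2)", so the list mixes a 2-tuple with a 3-tuple and the compiler falls back to a common supertype (reported as Product with Serializable in the error above). The sketch below reproduces this with plain Scala lists, no Spark required; the names broken and fixed are illustrative only and do not come from the question.

```scala
// Misplaced quote: the list holds a Tuple2 and a Tuple3, so the element
// type is widened to a common supertype instead of (String, Int).
val broken = List(("c", 2), ("d, 2)", ("e", 2), ("f", 2)))

// Corrected literal. The explicit type annotation is optional, but it makes
// the compiler report a mistake like the one above at this line.
val fixed: List[(String, Int)] = List(("c", 2), ("d", 2), ("e", 2), ("f", 2))

// Arity of each element: the broken list has only two elements.
println(broken.map(_.productArity))
println(fixed.map(_.productArity))
```

In the Spark shell the same habit applies: annotating the value, for example val rdd2: org.apache.spark.rdd.RDD[(String, Int)] = sc.parallelize(...), should make the compiler flag the malformed literal where rdd2 is defined instead of later at the union call.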