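Spark/Scala iterator cannot assign to a variable defined outside the foreach loop
Please note: although this question mentions Spark (2.1), I believe it is really a Scala (2.11) question, and any experienced Scala developer should be able to answer it!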
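I have the following code, which creates a Spark Dataset (essentially a 2D table) and iterates over it row by row. If a row's username column has the value "fizzbuzz", I want to set a variable that is defined outside the iterator and use that variable once the row iteration has finished: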
val myDataset = sqlContext
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("table" -> "mytable", "keyspace" -> "mykeyspace"))
  .load()

var foobar : String

myDataset.collect().foreach(rec =>
  if(rec.getAs("username") == "fizzbuzz") {
    foobar = rec.getAs("foobarval")
  }
)

if(foobar == null) {
  throw new Exception("The fizzbuzz user was not found.")
}
When I run this, I get the following error:
error: class $iw needs to be abstract, since:
it has 2 unimplemented members.
/** As seen from class $iw, the missing signatures are as follows.
* For convenience, these are usable as stub implementations.
*/
def foobar=(x$1: String): Unit = ???
class $iw extends Serializable {
^
Is there any particular reason why I am getting this?
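
A note on the likely cause, offered as a sketch rather than a definitive answer: in Scala 2, a `var` written with a type but no initializer is an abstract declaration, and because the shell wraps each snippet in a generated class (the `$iw` in the message), that class ends up with unimplemented members. The example below shows two ways the intended pattern could be written. It deliberately avoids Spark so that it runs on its own: the `Rec` case class, the `FoobarLookup` object, the `findFoobar` helper, and the sample rows are all hypothetical stand-ins for the rows returned by `myDataset.collect()`.

```scala
// Hypothetical stand-in for a collected row; not a Spark type.
final case class Rec(username: String, foobarval: String)

object FoobarLookup {
  // Returns the foobarval of the FIRST row whose username matches, if any.
  // Note: the foreach version below keeps the LAST match instead.
  def findFoobar(rows: Seq[Rec], user: String): Option[String] =
    rows.find(_.username == user).map(_.foobarval)

  def main(args: Array[String]): Unit = {
    // Made-up sample data for illustration only.
    val rows = Seq(Rec("alice", "a-val"), Rec("fizzbuzz", "fb-val"))

    // Variant 1: keep the var, but give it an initial value so that it is
    // a definition rather than an abstract declaration.
    var foobar: String = null
    rows.foreach { rec =>
      if (rec.username == "fizzbuzz") {
        foobar = rec.foobarval
      }
    }
    if (foobar == null) {
      throw new Exception("The fizzbuzz user was not found.")
    }

    // Variant 2: avoid mutable state and null by working with Option.
    val found = findFoobar(rows, "fizzbuzz")
      .getOrElse(throw new Exception("The fizzbuzz user was not found."))

    println(foobar)
    println(found)
  }
}
```

Variant 1 is the smallest change to the code in the question (only the `var` line differs), while Variant 2 trades the mutable variable for an `Option`, which makes the "not found" case explicit.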