How can I tell whether the SparkContext still has work executing, so I know when everything is finished and I can stop it? Currently I wait 30 seconds before calling SparkContext.stop, otherwise my application throws an error. How do I wait for the SparkContext to finish all of its processing?
import org.apache.log4j.Level
import org.apache.log4j.Logger
import org.apache.spark.SparkContext
object RatingsCounter extends App {
// set the log level to print only errors
Logger.getLogger("org").setLevel(Level.ERROR)
// create a SparkContext using every core of the local machine, named RatingsCounter
val sc = new SparkContext("local[*]", "RatingsCounter")
// load up each line of the ratings data into an RDD (Resilient Distributed Dataset)
val lines = sc.textFile("src/main/resource/u.data", 0)
// convert each line to a string, split it on tabs, and extract the third field.
// The file format is userID, movieID, rating, timestamp
val ratings = lines.map(x => x.split("\t")(2))
// count up how many times each value occurs
val results = ratings.countByValue()
// sort the resulting map of (rating, count) tuples
val sortedResults = results.toSeq.sortBy(_._1)
// print each result on its own line.
sortedResults.foreach { case (key, value) => println("movie ID: " + key + " - rating times: " + value) }
Thread.sleep(30000)
sc.stop()
}
斯卡拉= 2.11.8和火花= 1.6.1 –
Can you share the object in which you put your main function? – eliasah
Could you try using a def main instead of extending App, and passing 1 as the second argument to sc.textFile? –
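A minimal sketch of what the comment above suggests, assuming the same file path and the Spark 1.6 API: move the logic into an explicit def main instead of extending App (the delayed initialization that scala.App performs can leave fields uninitialized when Spark serializes closures, which is a documented source of errors), and pass 1 as the minPartitions argument to sc.textFile. Since countByValue() is an action, it blocks until the job completes, so sc.stop() can be called immediately after printing with no sleep.

import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkContext

object RatingsCounter {
  def main(args: Array[String]): Unit = {
    Logger.getLogger("org").setLevel(Level.ERROR)

    val sc = new SparkContext("local[*]", "RatingsCounter")

    // minPartitions = 1, as suggested in the comment above
    val lines = sc.textFile("src/main/resource/u.data", 1)
    val ratings = lines.map(x => x.split("\t")(2))

    // countByValue() is an action: it blocks until the Spark job
    // finishes, so everything below runs only after Spark is done.
    val results = ratings.countByValue()
    val sortedResults = results.toSeq.sortBy(_._1)
    sortedResults.foreach { case (key, value) =>
      println("movie ID: " + key + " - rating times: " + value)
    }

    // Safe to stop immediately: no Spark work is still running.
    sc.stop()
  }
}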