I am currently using Apache Spark. I want to measure how long the system takes to run a word count on a text file and store the result in a file, and I need to automate the commands with a bash script. I tried running the script below, passing the Scala commands inside the bash script:
start-all.sh
(time spark-shell
val inputfile = sc.textFile("/home/pi/Desktop/dataset/books_50.txt")
val counts = inputfile.flatMap(line => line.split(" ")).map(word => (word,1)).reduceByKey(_+_);
counts.toDebugString
counts.cache()
counts.saveAsTextFile("output")
exit()
) 2> /home/pi/Desktop/spark_output/test.txt
stop-all.sh
It showed the following error:
./wordcount_spark.sh: line 4: syntax error near unexpected token `('
./wordcount_spark.sh: line 4: ` val inputfile = sc.textFile("/home/pi/Desktop/dataset/books_50.txt")'
I tried adding EOF to the code, and I got the following error:
./wordcount_spark.sh: line 12: warning: here-document at line 3 delimited by end-of-file (wanted `EOF')
./wordcount_spark.sh: line 13: syntax error: unexpected end of file
I don't understand how to pass Scala commands to spark-shell from a bash script.
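For what it's worth, one common way to do this is to feed the Scala statements to spark-shell on stdin through a here-document. A minimal sketch, reusing the paths from the script above (the closing EOF must sit on its own line with no leading whitespace, which is what the "here-document delimited by end-of-file" warning is complaining about):

#!/bin/bash
start-all.sh

# time writes its report to stderr, so 2> captures the timing in test.txt
(time spark-shell <<'EOF'
val inputfile = sc.textFile("/home/pi/Desktop/dataset/books_50.txt")
val counts = inputfile.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
counts.saveAsTextFile("output")
EOF
) 2> /home/pi/Desktop/spark_output/test.txt

stop-all.sh

With input coming from the here-document, spark-shell exits on its own when it reaches the end of the input, so no explicit exit() call is needed.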
Run it with the 'spark-shell -i' option - here is an example - http://stackoverflow.com/questions/29928999/passing-command-line-arguments-to-spark-shell
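A minimal sketch of the '-i' approach the comment suggests, assuming the Scala statements from the question are saved in a hypothetical file /home/pi/Desktop/wordcount.scala that ends with System.exit(0) so the shell quits instead of dropping into the interactive prompt:

#!/bin/bash
start-all.sh

# -i preloads the Scala file into the spark-shell REPL; timing again goes to stderr
(time spark-shell -i /home/pi/Desktop/wordcount.scala) 2> /home/pi/Desktop/spark_output/test.txt

stop-all.sh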