The following lines of code worked fine in 1.6 but fail in 2.0.2. Any idea what the problem could be with PySpark on Windows (after upgrading from 1.6 to 2.0.2)? sqlContext.read.format fails:
file_name = "D:/ProgramFiles/spark-2.0.2-bin-hadoop2.3/data/mllib/sample_linear_regression_data.txt"
df_train = sqlContext.read.format("libsvm").load(file_name)
The error is:
File "<ipython-input-4-e5510d6d3d6a>", line 1, in <module>
df_train = sqlContext.read.format("libsvm").load("../data/mllib/sample_linear_regression_data.txt")
File "D:\ProgramFiles\spark-2.0.2-bin-hadoop2.3\python\lib\pyspark.zip\pyspark\sql\readwriter.py", line 147, in load
return self._df(self._jreader.load(path))
File "D:\ProgramFiles\spark-2.0.2-bin-hadoop2.3\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 1133, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "D:\ProgramFiles\spark-2.0.2-bin-hadoop2.3\python\lib\pyspark.zip\pyspark\sql\utils.py", line 79, in deco
raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
IllegalArgumentException: 'Can not create a Path from an empty string'
Is your path a local path? – MaFF
The problem persists with a local path as well. In general, PySpark and Spark are very unstable on Windows. It looks like they were designed for Linux. – Shiv
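One thing worth trying (a sketch, not a confirmed fix): on Windows, Spark 2.x is known to choke on path handling unless it is given an unambiguous absolute `file:` URI, and setting `spark.sql.warehouse.dir` explicitly has helped others with "empty string" path errors. The snippet below builds the URI with the standard library; the Spark calls are shown commented out because they require a working PySpark install, and the warehouse directory `D:/tmp/spark-warehouse` is a hypothetical placeholder you would replace with a real local directory.

```python
from pathlib import PureWindowsPath

def to_file_uri(win_path):
    """Convert an absolute Windows path like D:/dir/file.txt to a file:/// URI."""
    return PureWindowsPath(win_path).as_uri()

file_name = "D:/ProgramFiles/spark-2.0.2-bin-hadoop2.3/data/mllib/sample_linear_regression_data.txt"
uri = to_file_uri(file_name)  # e.g. 'file:///D:/ProgramFiles/...'

# With Spark 2.x, prefer the SparkSession entry point over the 1.x sqlContext,
# and pass an explicit warehouse dir (hypothetical path, adjust for your machine):
# from pyspark.sql import SparkSession
# spark = (SparkSession.builder
#          .config("spark.sql.warehouse.dir", "file:///D:/tmp/spark-warehouse")
#          .getOrCreate())
# df_train = spark.read.format("libsvm").load(uri)
```

If the error persists even with the `file:` URI, it may instead be the well-known Windows/Hadoop `winutils.exe` issue, which is a separate setup step.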