I'm programming with PySpark in the Eclipse IDE and am trying to move to Spark 1.4.1 so that I can finally program with Python 3. The following program works in Spark 1.3.1 but throws an exception in Spark 1.4.1: py4j.Py4JException: Method read([]) does not exist
from pyspark import SparkContext, SparkConf
from pyspark.sql.types import *
from pyspark.sql import SQLContext
if __name__ == '__main__':
    conf = SparkConf().setAppName("MyApp").setMaster("local")

    global sc
    sc = SparkContext(conf=conf)

    global sqlc
    sqlc = SQLContext(sc)

    symbolsPath = 'SP500Industry.json'
    symbolsRDD = sqlc.read.json(symbolsPath)

    print "Done"
The traceback I get is the following:
Traceback (most recent call last):
File "/media/gavin/20A6-76BF/Current Projects Luna/PySpark Test/Test.py", line 21, in <module>
symbolsRDD = sqlc.read.json(symbolsPath) #rdd with all symbols (and their industries
File "/home/gavin/spark-1.4.1-bin-hadoop2.6/python/pyspark/sql/context.py", line 582, in read
return DataFrameReader(self)
File "/home/gavin/spark-1.4.1-bin-hadoop2.6/python/pyspark/sql/readwriter.py", line 39, in __init__
self._jreader = sqlContext._ssql_ctx.read()
File "/home/gavin/spark-1.4.1-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
File "/home/gavin/spark-1.4.1-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 304, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling o18.read. Trace:
py4j.Py4JException: Method read([]) does not exist
at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:333)
at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:342)
at py4j.Gateway.invoke(Gateway.java:252)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:745)
The external libraries on my project's build path are ... spark-1.4.1-bin-hadoop2.6/python ... spark-1.4.1-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip ... spark-1.4.1-bin-hadoop2.6/python/lib/pyspark.zip (I've tried both with and without this last one).
Can anyone help me figure out what I'm doing wrong?
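One thing I've been trying to rule out (a sketch of a diagnostic, not a confirmed cause): this Py4J error usually means the Python-side PySpark sources (1.4.1, where `SQLContext.read` exists) are calling into a JVM backend launched from an older Spark (1.3.1, where it doesn't), so a stale 1.3.1 entry may still be shadowing the 1.4.1 one. The helper `spark_entries` below is hypothetical, not part of any Spark API; it just filters `sys.path` for Spark/Py4J-looking entries so a version mix-up is visible:

```python
import sys

def spark_entries(paths):
    """Return the path entries that look like Spark or Py4J installations."""
    return [p for p in paths if "spark-" in p.lower() or "py4j" in p.lower()]

if __name__ == '__main__':
    # Print every Spark/Py4J entry the interpreter will import from;
    # any path mentioning a version other than 1.4.1 is a red flag.
    for entry in spark_entries(sys.path):
        print(entry)
```

If this turns up both a 1.3.1 and a 1.4.1 path, removing the old one from the build path (and restarting the interpreter) would be the next thing to try.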
I get exactly the same error mentioned in the OP even with your adjustment. Thanks for the help. –