Sorry to bother you. Today I tried to run a program that creates a DataFrame with sqlContext in PySpark, and it failed with "AttributeError: 'NoneType' object has no attribute 'sc'". My machine runs Windows 7, the Spark version is 1.6.0, and the API is Python 3. I have googled several times and read the Spark Python API Docs, but I can't solve the problem, so I'm asking for your help.
My code is:
# Python version is 3.5
sc.stop()
import pandas as pd
import numpy as np
sc = SparkContext("local", "app1")
data2 = [("a", 5), ("b", 5), ("a", 5)]
print(data2)
df = sqlContext.createDataFrame(data2)
And the result is:
AttributeError Traceback (most recent call last)
<ipython-input-19-030b8faadb2c> in <module>()
5 data2=[("a",5),("b",5),("a",5)]
6 print(data2)
----> 7 df=sqlContext.createDataFrame(data2)
D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\sql\context.py in createDataFrame(self, data, schema, samplingRatio)
426 rdd, schema = self._createFromRDD(data, schema, samplingRatio)
427 else:
--> 428 rdd, schema = self._createFromLocal(data, schema)
429 jrdd = self._jvm.SerDeUtil.toJavaArray(rdd._to_java_object_rdd())
430 jdf = self._ssql_ctx.applySchemaToPythonRDD(jrdd.rdd(), schema.json())
D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\sql\context.py in _createFromLocal(self, data, schema)
358 # convert python objects to sql data
359 data = [schema.toInternal(row) for row in data]
--> 360 return self._sc.parallelize(data), schema
361
362 @since(1.3)
D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\context.py in parallelize(self, c, numSlices)
410 [[], [0], [], [2], [4]]
411 """
--> 412 numSlices = int(numSlices) if numSlices is not None else self.defaultParallelism
413 if isinstance(c, xrange):
414 size = len(c)
D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\context.py in defaultParallelism(self)
346 reduce tasks)
347 """
--> 348 return self._jsc.sc().defaultParallelism()
349
350 @property
AttributeError: 'NoneType' object has no attribute 'sc'
I'm confused: I did create the new "sc", so why does it report "'NoneType' object has no attribute 'sc'"?
Why did you stop the 'SparkContext' ('sc.stop()')? – 2016-11-28 10:31:06
If I don't add sc.stop(), it raises an error: 'ValueError: Cannot run multiple SparkContexts at once; existing SparkContext(app=PySparkShell, master=local[*]) created at D:\Program Files\Anaconda3\lib\site-packages\IPython\utils\py3compat.py:186'. –
Let me rephrase: why do you stop the context and then create a new one at all? – 2016-11-28 13:39:12
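For anyone hitting the same error, the likely cause is visible in the traceback: the shell's sqlContext was built around the original SparkContext, and sc.stop() clears that context's JVM handle (_jsc becomes None). The old sqlContext still points at the stopped context, so createDataFrame fails the moment it touches self._sc. A minimal sketch of the fix, assuming Spark 1.6 and the PySpark shell, is to rebuild the SQLContext around the new context:

# Minimal sketch, assuming the PySpark shell (sc predefined) on Spark 1.6.
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc.stop()                           # stop the shell's original context
sc = SparkContext("local", "app1")  # create a fresh context
sqlContext = SQLContext(sc)         # rebuild SQLContext around the NEW sc

data2 = [("a", 5), ("b", 5), ("a", 5)]
df = sqlContext.createDataFrame(data2)  # now resolves sc correctly
df.show()

The key line is SQLContext(sc): any SQLContext must be recreated whenever sc is replaced, because it keeps a reference to the context it was constructed with. If you don't actually need a new context, the simpler route is to skip sc.stop() entirely and use the sc and sqlContext the shell already provides.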