
PySpark works fine with Python 2.7. I installed Python 3.5.1 (built from source), but when I run pyspark from the terminal with that Python, importing pyspark fails with the error below:

Python 3.5.1 (default, Apr 25 2016, 12:41:28) 
[GCC 4.8.4] on linux 
Type "help", "copyright", "credits" or "license" for more information. 
Traceback (most recent call last): 
  File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/shell.py", line 30, in <module> 
    import pyspark 
  File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/__init__.py", line 41, in <module> 
    from pyspark.context import SparkContext 
  File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/context.py", line 28, in <module> 
    from pyspark import accumulators 
  File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/accumulators.py", line 98, in <module> 
    from pyspark.serializers import read_int, PickleSerializer 
  File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/serializers.py", line 58, in <module> 
    import zlib 
ImportError: No module named 'zlib' 

I also tried Python 3.4.3, and it works fine.

Answers


Have you checked whether zlib is actually present in the Python installation you are running pyspark with? It should be there by default, but strange things happen.
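
A quick way to check is to import zlib with the same interpreter that pyspark uses. A minimal sketch (the zlib1g-dev package name assumes Debian/Ubuntu; use your distribution's equivalent):

# check_zlib.py -- run with the interpreter PySpark is configured to use
try:
    import zlib
    print("zlib is available, version:", zlib.ZLIB_VERSION)
except ImportError:
    # A common cause for source builds: the zlib development headers
    # were missing at compile time. Install them (e.g. zlib1g-dev on
    # Debian/Ubuntu) and rebuild Python 3.5.1.
    print("zlib is NOT built into this interpreter")

If the import fails here as well, the problem is the Python build itself, not PySpark.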


Did you put the exact path of the system Python 3.5.1 into "PYSPARK_PYTHON" in your .bashrc file?
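
If not, something along these lines should work. This is a sketch only; the /usr/local/bin/python3.5 path is an assumption, so point it at wherever your source build actually installed:

# ~/.bashrc -- hypothetical install path, adjust to your machine
export PYSPARK_PYTHON=/usr/local/bin/python3.5
export PYSPARK_DRIVER_PYTHON=/usr/local/bin/python3.5

Then reload with source ~/.bashrc and start pyspark again.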

Welcome to 
      ____              __ 
     / __/__  ___ _____/ /__ 
    _\ \/ _ \/ _ `/ __/ '_/ 
   /__ / .__/\_,_/_/ /_/\_\   version 2.1.1 
      /_/ 

Using Python version 3.6.1 (default, Jun 23 2017 16:20:09) 
SparkSession available as 'spark'. 

This is what my PySpark prompt shows. The Apache Spark version is 2.1.1.

PS: I use Anaconda3 (Python 3.6.1) for my day-to-day PySpark code, with PYSPARK_DRIVER_PYTHON set to 'jupyter'.

The example above uses my system default Python 3.6.
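
For reference, the Jupyter setup from the PS looks roughly like this in .bashrc (the Anaconda3 location is a hypothetical path; PYSPARK_DRIVER_PYTHON and PYSPARK_DRIVER_PYTHON_OPTS are the variables the pyspark launch script reads):

# ~/.bashrc -- hypothetical Anaconda3 path, adjust to your machine
export PYSPARK_PYTHON=/home/user/anaconda3/bin/python
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS=notebook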