PyCharm overrides PYTHONPATH in the Docker container being used as the interpreter

I have a Docker image that contains Spark. Here is my Dockerfile:

FROM docker-dev.artifactory.company.com/centos:7.3.1611 

# set proxy 
ENV http_proxy http://proxyaddr.co.uk:8080 
ENV HTTPS_PROXY http://proxyaddr.co.uk:8080 
ENV https_proxy http://proxyaddr.co.uk:8080 

# build tools, Kerberos libraries, Python tooling, and a JRE for Spark
RUN yum install -y epel-release 
RUN yum install -y gcc 
RUN yum install -y krb5-devel 
RUN yum install -y python-devel 
RUN yum install -y krb5-workstation 
RUN yum install -y python-setuptools 
RUN yum install -y python-pip 
RUN yum install -y xmlstarlet 
RUN yum install -y wget java-1.8.0-openjdk 
RUN pip install kerberos 
RUN pip install numpy 
RUN pip install pandas 
RUN pip install coverage 
RUN pip install tensorflow 
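
# download and unpack Spark 1.6.0 (pre-built for Hadoop 2.6)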
RUN wget http://d3kbcqa49mib13.cloudfront.net/spark-1.6.0-bin-hadoop2.6.tgz 
RUN tar -xvzf spark-1.6.0-bin-hadoop2.6.tgz -C /opt 
RUN ln -s spark-1.6.0-bin-hadoop2.6 /opt/spark 


ENV VERSION_NUMBER $(cat VERSION) 
ENV JAVA_HOME /etc/alternatives/jre/ 
ENV SPARK_HOME /opt/spark 
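# put Spark's Python bindings and its bundled py4j onto PYTHONPATH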
ENV PYTHONPATH $SPARK_HOME/python/:$PYTHONPATH 
ENV PYTHONPATH $SPARK_HOME/python/lib/py4j-0.9-src.zip:$PYTHONPATH 

I can build and then run the Docker image, attach to it, and successfully import the pyspark library:

$ docker run -d -it sse_spark_build:1.0 
09e8aac622d7500e147a6e6db69f806fe093b0399b98605c5da2ff5e0feca07c 
$ docker exec -it 09e8aac622d7 python 
Python 2.7.5 (default, Nov 6 2016, 00:28:07) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-11)] on linux2 
Type "help", "copyright", "credits" or "license" for more information. 
>>> from pyspark import SparkContext 
>>> import os 
>>> os.environ['PYTHONPATH'] 
'/opt/spark/python/lib/py4j-0.9-src.zip:/opt/spark/python/:' 
>>> 

Note the value of PYTHONPATH!

The problem is that PyCharm behaves differently if I use that same Docker image as the interpreter. Here is how I have set up the interpreter:

[screenshot: python interpreter setup]

If I then run a Python console in PyCharm, this happens:

bec0b9189066:python /opt/.pycharm_helpers/pydev/pydevconsole.py 0 0 
PyDev console: starting. 
import sys; print('Python %s on %s' % (sys.version, sys.platform)) 
sys.path.extend(['/home/cengadmin/git/dhgitlab/sse/engine/fs/programs/pyspark', '/home/cengadmin/git/dhgitlab/sse/engine/fs/programs/pyspark']) 
Python 2.7.5 (default, Nov 6 2016, 00:28:07) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-11)] on linux2 
import os 
os.environ['PYTHONPATH'] 
'/opt/.pycharm_helpers/pydev' 

As you can see, PyCharm overwrites PYTHONPATH, which means I can no longer import the pyspark library that I want to use:

from pyspark import SparkContext 
Traceback (most recent call last): 
    File "<input>", line 1, in <module> 
ImportError: No module named pyspark 

OK, I can change the path from within the console to make it work:

import sys 
# re-add the Spark paths that the PyCharm console dropped from PYTHONPATH 
sys.path.append('/opt/spark/python/') 
sys.path.append('/opt/spark/python/lib/py4j-0.9-src.zip') 

But I have to do this every time I open a console. I can't believe there is no way to tell PyCharm to append to PYTHONPATH rather than overwrite it, but if there is, I can't find it. Can anyone advise? How can I use a Docker image as a remote interpreter in PyCharm and preserve the value of PYTHONPATH?

Answer

You can set this in the Preferences. See the image below. [screenshot: Setting the environment setup]

You can either set the environment variables there or update the Starting script section, whichever suits you better; both will do the job.
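
If you go the starting-script route, the following is a minimal sketch, assuming the Spark paths from the Dockerfile above. Paste it into the Starting script field under Settings > Build, Execution, Deployment > Console > Python Console, so the paths are restored every time a console opens:

import sys 

# sketch of a console starting script: re-add the Spark paths that 
# pydevconsole drops; the paths assume the SPARK_HOME layout above 
for path in ('/opt/spark/python/', '/opt/spark/python/lib/py4j-0.9-src.zip'): 
    if path not in sys.path:  # avoid duplicate entries across console restarts 
        sys.path.append(path) 

With that in place, from pyspark import SparkContext should work in every new console without any manual fixes.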

Also read the following article if you need further help: https://www.jetbrains.com/help/pycharm/python-console.html

Oh very cool, thank you very much – jamiet