When running pyspark 1.6.X it comes up just fine. But I am unable to run pyspark 2.X due to a Hive metastore connection issue.
17/02/25 17:35:41 INFO storage.BlockManagerMaster: Registered BlockManager
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /__ / .__/\_,_/_/ /_/\_\   version 1.6.1
      /_/
Using Python version 2.7.13 (default, Dec 17 2016 23:03:43)
SparkContext available as sc, SQLContext available as sqlContext.
>>>
But after I reset SPARK_HOME, PYTHONPATH and PATH to point to the Spark 2.x installation, things went south quickly:

(a) I have to manually delete the Derby metastore_db every time.
(b) pyspark does not start:
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
NOTE: SPARK_PREPEND_CLASSES is set, placing locally compiled Spark classes ahead of assembly.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/02/25 17:32:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/02/25 17:32:53 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
17/02/25 17:32:53 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
I do not need or care about Hive functionality: pyspark hangs after printing these unhappy warnings, though it may well be that Hive is required by Spark 2.X. What is the simplest working configuration that makes pyspark 2.X happy?
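For reference, here is a sketch of the kind of configuration I have in mind (untested; it assumes the built-in in-memory catalog would be acceptable instead of the Hive metastore, and the app name is arbitrary):

from pyspark.sql import SparkSession

# Sketch: build a Spark 2.x session with the in-memory catalog instead of the
# Hive metastore, so no Derby metastore_db directory should be created.
spark = (SparkSession.builder
         .appName("no-hive-test")                                 # arbitrary app name
         .config("spark.sql.catalogImplementation", "in-memory")
         .getOrCreate())

sc = spark.sparkContext          # Spark 2.x analogue of the 1.6 `sc`
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])
df.show()

The same setting could presumably be passed when launching the shell, e.g. pyspark --conf spark.sql.catalogImplementation=in-memory, but I have not verified that this avoids the hang.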
Having the warnings is fine, they just say an empty metastore is being created. Which libraries did you prepend via SPARK_PREPEND_CLASSES? Could you attach a thread dump of the Spark JVM process from when the pyspark initialization hangs? – Mariusz
Have you tried the ['enableHiveSupport'](http://spark.apache.org/docs/latest/api/python/pyspark.sql.html#pyspark.sql.SparkSession.Builder.enableHiveSupport) function? Even though I wasn't accessing Hive, I also ran into DataFrame problems when migrating from 1.6 to 2.x. Calling that function on the builder solved my issue. (You can also add it to the config.) – santon
@santon Please make that an answer: I do have some follow-up questions, but want to start by giving credit. – javadba
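To illustrate what santon describes above, a minimal sketch (enableHiveSupport is the documented SparkSession.Builder method; the app name and the query are only illustrative and not verified against this particular setup):

from pyspark.sql import SparkSession

# Sketch: explicitly enable Hive support on the Spark 2.x session builder,
# as suggested in the comment above.
spark = (SparkSession.builder
         .appName("with-hive-support")   # arbitrary app name
         .enableHiveSupport()
         .getOrCreate())

spark.sql("SHOW DATABASES").show()       # simple sanity check against the catalog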