2016-08-03 76 views

I tried upgrading Spark 1.5.2 to Spark 2.0.0 on two machines (node3, node7) for testing. I submit the job through Spark 2.0.0's spark-submit, but the job runs on Spark 1.5.2. Why does a job submitted with Spark 2.0.0 run on Spark 1.5.2?

The error I get when submitting the job from node3:

~/software/spark-2.0.0-bin-hadoop2.6/bin$ spark-submit --master mesos://192.168.1.5050 ../examples/src/main/python/pimy.py 

Mesos executor stderr log on node7:

sh: 1: /home/jianxun/software/spark-1.5.2-bin-hadoop2.6/bin/spark-class: not found 

node3 JDK:

openjdk version "1.8.0_91" 
OpenJDK Runtime Environment (build 1.8.0_91-8u91-b14-3ubuntu1~15.10.1-b14) 
OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode) 

node3 /etc/profile:

export M2_HOME=/usr/share/maven 
export M2=$M2_HOME/bin 
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 
export PATH=$JAVA_HOME/bin:$PATH 
export PATH=/home/jianxun/software/mongodb-linux-x86_64-3.2.0/bin:$PATH 
export HIVE_HOME=/home/jianxun/software/apache-hive-2.0.1-bin 
export PATH=$HIVE_HOME/bin:$PATH 
export CLASSPATH=$CLASSPATH:/usr/share/java/mysql.jar 
export SPARK_HOME=/home/jianxun/software/spark-2.0.0-bin-hadoop2.6 

node7 JDK:

openjdk version "1.8.0_91" 
OpenJDK Runtime Environment (build 1.8.0_91-8u91-b14-3ubuntu1~15.10.1-b14) 
OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode) 

node7 /etc/profile:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 
export PATH=$JAVA_HOME/bin:$PATH 
export SPARK_HOME=/home/jianxun/software/spark-2.0.0-bin-hadoop2.6 
export PYTHONPATH=/usr/lib/python2.7 

The Mesos version is 0.25; the Mesos master is node3, and the only Mesos slave is node7. node3 has two Spark versions:

  1. ~/software/spark-2.0.0-bin-hadoop2.6/
  2. ~/software/spark-1.5.2-bin-hadoop2.6/
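
Since node3 has both Spark trees on disk, whichever bin/ directory appears first on PATH decides which spark-submit actually runs. A minimal sketch of that PATH-precedence behavior, using hypothetical stub scripts in place of the real installs:

```shell
# Simulate two Spark installs with stub spark-submit scripts;
# the directory listed first on PATH is the one that runs.
old=$(mktemp -d); new=$(mktemp -d)
printf '#!/bin/sh\necho 1.5.2\n' > "$old/spark-submit"
printf '#!/bin/sh\necho 2.0.0\n' > "$new/spark-submit"
chmod +x "$old/spark-submit" "$new/spark-submit"

PATH="$old:$new:$PATH" spark-submit   # prints 1.5.2 -- old install wins
PATH="$new:$old:$PATH" spark-submit   # prints 2.0.0 -- new install wins
```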

Spark configuration on node3:

spark-env.sh

export MESOS_NATIVE_JAVA_LIBRARY=/home/jianxun/software/mesos/lib/libmesos-0.25.0.so 
export SCALA_HOME=/usr/share/scala-2.11 
export SPARK_EXCUTOR_URI=/home/jianxun/software/spark-2.0.0-bin-hadoop2.6.tgz 

spark-defaults.conf

spark.local.dir     /data/sparktmp 
spark.shuffle.service.enabled  true 
spark.mesos.coarse     true 
spark.executor.memory    24g 
spark.executor.cores    7 
spark.cores.max     7 
spark.executor.uri     /home/jianxun/software/spark-2.0.0-bin-hadoop2.6.tgz 

node7 only has the new Spark version:

  1. ~/software/spark-2.0.0-bin-hadoop2.6/
  2. ~/software/spark-2.0.0-bin-hadoop2.6.tgz (binary archive)

spark-submit log (the important parts are marked with ****):

********************************************************* 
********************************************************* 
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
SLF4J: Class path contains multiple SLF4J bindings. 
SLF4J: Found binding in [jar:file:/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/lib/spark-examples-1.5.2-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
SLF4J: Found binding in [jar:file:/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/lib/spark-assembly-1.5.2-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 
16/08/03 12:31:33 INFO SparkContext: Running Spark version 1.5.2 
***************************************************************** 
***************************************************************** 
16/08/03 12:31:34 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
16/08/03 12:31:34 WARN SparkConf: In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN). 
16/08/03 12:31:34 INFO SecurityManager: Changing view acls to: jianxun 
16/08/03 12:31:34 INFO SecurityManager: Changing modify acls to: jianxun 
16/08/03 12:31:34 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(jianxun); users with modify permissions: Set(jianxun) 
16/08/03 12:31:34 INFO Slf4jLogger: Slf4jLogger started 
16/08/03 12:31:34 INFO Remoting: Starting remoting 
16/08/03 12:31:34 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:40978] 
16/08/03 12:31:34 INFO Utils: Successfully started service 'sparkDriver' on port 40978. 
16/08/03 12:31:34 INFO SparkEnv: Registering MapOutputTracker 
16/08/03 12:31:34 INFO SparkEnv: Registering BlockManagerMaster 
16/08/03 12:31:34 INFO DiskBlockManager: Created local directory at /data/sparktmp/blockmgr-76944d0c-de18-4f52-9249-8c3ca6141f59 
16/08/03 12:31:34 INFO MemoryStore: MemoryStore started with capacity 12.4 GB 
16/08/03 12:31:34 INFO HttpFileServer: HTTP File server directory is /data/sparktmp/spark-eba79d72-dd11-4d5d-a008-9964522fcc24/httpd-a64948d7-9e78-42f0-b711-84fc5f040517 
16/08/03 12:31:34 INFO HttpServer: Starting HTTP Server 
16/08/03 12:31:35 INFO Utils: Successfully started service 'HTTP file server' on port 35616. 
16/08/03 12:31:35 INFO SparkEnv: Registering OutputCommitCoordinator 
16/08/03 12:31:35 INFO Utils: Successfully started service 'SparkUI' on port 4040. 
16/08/03 12:31:35 INFO SparkUI: Started SparkUI at http://192.168.1.203:4040 
16/08/03 12:31:35 INFO Utils: Copying /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py to /data/sparktmp/spark-eba79d72-dd11-4d5d-a008-9964522fcc24/userFiles-03a46142-7a44-43d0-82de-10c174721a99/pimy.py 
16/08/03 12:31:35 INFO SparkContext: Added file file:/home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py at http://192.168.1.203:35616/files/pimy.py with timestamp 1470198695252 
16/08/03 12:31:35 WARN SparkContext: Using SPARK_MEM to set amount of memory to use per executor process is deprecated, please use spark.executor.memory instead. 
16/08/03 12:31:35 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set. 
I0803 12:31:35.419636 32575 sched.cpp:164] Version: 0.25.0 
I0803 12:31:35.430359 32570 sched.cpp:262] New master detected at [email protected]:5050 
I0803 12:31:35.431447 32570 sched.cpp:272] No credentials provided. Attempting to register without authentication 
I0803 12:31:35.434844 32570 sched.cpp:641] Framework registered with ff2cf87e-3712-413f-a452-6d71430527bc-0012 
16/08/03 12:31:35 INFO MesosSchedulerBackend: Registered as framework ID ff2cf87e-3712-413f-a452-6d71430527bc-0012 
16/08/03 12:31:35 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41218. 
16/08/03 12:31:35 INFO NettyBlockTransferService: Server created on 41218 
16/08/03 12:31:35 INFO BlockManagerMaster: Trying to register BlockManager 
16/08/03 12:31:35 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.1.203:41218 with 12.4 GB RAM, BlockManagerId(driver, 192.168.1.203, 41218) 
16/08/03 12:31:35 INFO BlockManagerMaster: Registered BlockManager 
16/08/03 12:31:36 INFO SparkContext: Starting job: reduce at /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py:38 
16/08/03 12:31:36 INFO DAGScheduler: Got job 0 (reduce at /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py:38) with 2 output partitions 
16/08/03 12:31:36 INFO DAGScheduler: Final stage: ResultStage 0(reduce at /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py:38) 
16/08/03 12:31:36 INFO DAGScheduler: Parents of final stage: List() 
16/08/03 12:31:36 INFO DAGScheduler: Missing parents: List() 
16/08/03 12:31:36 INFO DAGScheduler: Submitting ResultStage 0 (PythonRDD[1] at reduce at /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py:38), which has no missing parents 
16/08/03 12:31:36 INFO MemoryStore: ensureFreeSpace(4272) called with curMem=0, maxMem=13335873454 
16/08/03 12:31:36 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 4.2 KB, free 12.4 GB) 
16/08/03 12:31:36 INFO MemoryStore: ensureFreeSpace(2792) called with curMem=4272, maxMem=13335873454 
.... 
.... 
16/08/03 12:31:37 INFO DAGScheduler: Job 0 failed: reduce at /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py:38, took 1.002633 s 
Traceback (most recent call last): 
    File "/home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py", line 38, in <module> 
    count = sc.parallelize(range(1, n + 1), partitions).map(f).reduce(add) 
    File "/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 799, in reduce 
    File "/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 773, in collect 
    File "/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__ 
    File "/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value 
py4j.protocol.Py4JJavaError16/08/03 12:31:37 INFO DAGScheduler: Executor lost: ff2cf87e-3712-413f-a452-6d71430527bc-S4 (epoch 3) 
: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe. 
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 7, node7): ExecutorLostFailure (executor ff2cf87e-3712-413f-a452-6d71430527bc-S4lost) 
Driver stacktrace: 
     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283) 
     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271) 
     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270) 
     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) 
     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) 
     at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270) 
     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697) 
     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697) 
     at scala.Option.foreach(Option.scala:236) 
     at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697) 
     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1496) 
     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458) 
     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447) 
     at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) 
     at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567) 
     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824) 
     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837) 
     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1850) 
     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1921) 
     at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:909) 
     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147) 
     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108) 
     at org.apache.spark.rdd.RDD.withScope(RDD.scala:310) 
     at org.apache.spark.rdd.RDD.collect(RDD.scala:908) 
     at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:405) 
     at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala) 
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
     at java.lang.reflect.Method.invoke(Method.java:498) 
     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231) 
     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379) 
     at py4j.Gateway.invoke(Gateway.java:259) 
     at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133) 
     at py4j.commands.CallCommand.execute(CallCommand.java:79) 
     at py4j.GatewayConnection.run(GatewayConnection.java:207) 
     at java.lang.Thread.run(Thread.java:745) 
16/08/03 12:31:37 INFO BlockManagerMasterEndpoint: Trying to remove executor ff2cf87e-3712-413f-a452-6d71430527bc-S4 from BlockManagerMaster. 
16/08/03 12:31:37 INFO BlockManagerMaster: Removed ff2cf87e-3712-413f-a452-6d71430527bc-S4 successfully in removeExecutor 
16/08/03 12:31:37 INFO DAGScheduler: Host added was in lost list earlier: node7 
16/08/03 12:31:37 INFO SparkContext: Invoking stop() from shutdown hook 
16/08/03 12:31:37 INFO SparkUI: Stopped Spark web UI at http://192.168.1.203:4040 
16/08/03 12:31:37 INFO DAGScheduler: Stopping DAGScheduler 
I0803 12:31:37.146209 32592 sched.cpp:1771] Asked to stop the driver 
I0803 12:31:37.146414 32573 sched.cpp:1040] Stopping framework 'ff2cf87e-3712-413f-a452-6d71430527bc-0012' 
16/08/03 12:31:37 INFO MesosSchedulerBackend: driver.run() returned with code DRIVER_STOPPED 
16/08/03 12:31:37 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 
16/08/03 12:31:37 INFO MemoryStore: MemoryStore cleared 
16/08/03 12:31:37 INFO BlockManager: BlockManager stopped 
16/08/03 12:31:37 INFO BlockManagerMaster: BlockManagerMaster stopped 
16/08/03 12:31:37 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 
16/08/03 12:31:37 INFO SparkContext: Successfully stopped SparkContext 
16/08/03 12:31:37 INFO ShutdownHookManager: Shutdown hook called 
16/08/03 12:31:37 INFO ShutdownHookManager: Deleting directory /data/sparktmp/spark-eba79d72-dd11-4d5d-a008-9964522fcc24/pyspark-02048aa7-deaf-4af5-adde-86732cd44324 
16/08/03 12:31:37 INFO ShutdownHookManager: Deleting directory /data/sparktmp/spark-eba79d72-dd11-4d5d-a008-9964522fcc24 

mesos.WARNING log on node7:

Log file created at: 2016/08/03 12:31:36 
Running on machine: node7 
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg 
W0803 12:31:36.408701 5686 containerizer.cpp:988] Ignoring update for unknown container: 9910a15a-ec96-4e5a-91b9-58652b2bcaa5 
W0803 12:31:36.409050 5686 containerizer.cpp:988] Ignoring update for unknown container: 9910a15a-ec96-4e5a-91b9-58652b2bcaa5 
W0803 12:31:36.613108 5687 containerizer.cpp:988] Ignoring update for unknown container: 108436bb-429b-4214-9d9b-9fa452383093 
W0803 12:31:36.613817 5691 containerizer.cpp:988] Ignoring update for unknown container: 108436bb-429b-4214-9d9b-9fa452383093 
W0803 12:31:36.807909 5692 containerizer.cpp:988] Ignoring update for unknown container: 5c9abbdb-ee6a-4175-8087-d6d1dd1bd5ea 
W0803 12:31:36.808281 5692 containerizer.cpp:988] Ignoring update for unknown container: 5c9abbdb-ee6a-4175-8087-d6d1dd1bd5ea 
W0803 12:31:37.019579 5687 containerizer.cpp:988] Ignoring update for unknown container: 7a11174e-7774-453c-bdf7-5cbb5b4afcfa 
W0803 12:31:37.020051 5693 containerizer.cpp:988] Ignoring update for unknown container: 7a11174e-7774-453c-bdf7-5cbb5b4afcfa 
W0803 12:31:37.142438 5690 slave.cpp:1995] Cannot shut down unknown framework ff2cf87e-3712-413f-a452-6d71430527bc-0012 

Answer


Source your /etc/profile file.
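
To illustrate: editing /etc/profile does not change variables in an already-running shell; the file has to be sourced (or a new login started). A minimal demonstration, with a temporary file standing in for /etc/profile:

```shell
# A temp file stands in for /etc/profile; exporting SPARK_HOME there
# has no effect on the current shell until the file is sourced.
profile=$(mktemp)
echo 'export SPARK_HOME=/home/jianxun/software/spark-2.0.0-bin-hadoop2.6' > "$profile"
unset SPARK_HOME
. "$profile"          # same effect as: source /etc/profile
echo "$SPARK_HOME"    # now prints the spark-2.0.0 path
rm -f "$profile"
```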


Thanks. After editing SPARK_HOME I had not sourced /etc/profile. After sourcing it, Spark runs the right version.


OK, I'll keep that in mind.
