
Spark fails when running the pi.py example in yarn-client mode. I can run the Java version of the Pi example successfully, as shown below:

./bin/spark-submit --class org.apache.spark.examples.SparkPi \ 
    --master yarn-client \ 
    --num-executors 3 \ 
    --driver-memory 4g \ 
    --executor-memory 2g \ 
    --executor-cores 1 \ 
    --queue thequeue \ 
    lib/spark-examples*.jar \ 
    10 

However, the Python version fails with the error below. I am using yarn-client mode; the pyspark shell in yarn-client mode returns the same error. Can anyone help me figure out this problem?

nlp@yyy2:~/spark$ ./bin/spark-submit --master yarn-client examples/src/main/python/pi.py 
15/01/05 17:22:26 INFO spark.SecurityManager: Changing view acls to: nlp 
15/01/05 17:22:26 INFO spark.SecurityManager: Changing modify acls to: nlp 
15/01/05 17:22:26 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(nlp); users with modify permissions: Set(nlp) 
15/01/05 17:22:26 INFO slf4j.Slf4jLogger: Slf4jLogger started 
15/01/05 17:22:26 INFO Remoting: Starting remoting 
15/01/05 17:22:26 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@yyy2:42747] 
15/01/05 17:22:26 INFO util.Utils: Successfully started service 'sparkDriver' on port 42747. 
15/01/05 17:22:26 INFO spark.SparkEnv: Registering MapOutputTracker 
15/01/05 17:22:26 INFO spark.SparkEnv: Registering BlockManagerMaster 
15/01/05 17:22:26 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20150105172226-aeae 
15/01/05 17:22:26 INFO storage.MemoryStore: MemoryStore started with capacity 265.1 MB 
15/01/05 17:22:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
15/01/05 17:22:27 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-cbe0079b-79c5-426b-b67e-548805423b11 
15/01/05 17:22:27 INFO spark.HttpServer: Starting HTTP Server 
15/01/05 17:22:27 INFO server.Server: jetty-8.y.z-SNAPSHOT 
15/01/05 17:22:27 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:57169 
15/01/05 17:22:27 INFO util.Utils: Successfully started service 'HTTP file server' on port 57169. 
15/01/05 17:22:27 INFO server.Server: jetty-8.y.z-SNAPSHOT 
15/01/05 17:22:27 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040 
15/01/05 17:22:27 INFO util.Utils: Successfully started service 'SparkUI' on port 4040. 
15/01/05 17:22:27 INFO ui.SparkUI: Started SparkUI at http://yyy2:4040 
15/01/05 17:22:27 INFO client.RMProxy: Connecting to ResourceManager at yyy14/10.112.168.195:8032 
15/01/05 17:22:27 INFO yarn.Client: Requesting a new application from cluster with 6 NodeManagers 
15/01/05 17:22:27 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container) 
15/01/05 17:22:27 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead 
15/01/05 17:22:27 INFO yarn.Client: Setting up container launch context for our AM 
15/01/05 17:22:27 INFO yarn.Client: Preparing resources for our AM container 
15/01/05 17:22:28 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 24 for xxx on ha-hdfs:hzdm-cluster1 
15/01/05 17:22:28 INFO yarn.Client: Uploading resource file:/home/nlp/platform/spark-1.2.0-bin-2.5.2/lib/spark-assembly-1.2.0-hadoop2.5.2.jar -> hdfs://hzdm-cluster1/user/nlp/.sparkStaging/application_1420444011562_0023/spark-assembly-1.2.0-hadoop2.5.2.jar 
15/01/05 17:22:29 INFO yarn.Client: Uploading resource file:/home/nlp/platform/spark-1.2.0-bin-2.5.2/examples/src/main/python/pi.py -> hdfs://hzdm-cluster1/user/nlp/.sparkStaging/application_1420444011562_0023/pi.py 
15/01/05 17:22:29 INFO yarn.Client: Setting up the launch environment for our AM container 
15/01/05 17:22:29 INFO spark.SecurityManager: Changing view acls to: nlp 
15/01/05 17:22:29 INFO spark.SecurityManager: Changing modify acls to: nlp 
15/01/05 17:22:29 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(nlp); users with modify permissions: Set(nlp) 
15/01/05 17:22:29 INFO yarn.Client: Submitting application 23 to ResourceManager 
15/01/05 17:22:30 INFO impl.YarnClientImpl: Submitted application application_1420444011562_0023 
15/01/05 17:22:31 INFO yarn.Client: Application report for application_1420444011562_0023 (state: ACCEPTED) 
15/01/05 17:22:31 INFO yarn.Client: 
     client token: Token { kind: YARN_CLIENT_TOKEN, service: } 
     diagnostics: N/A 
     ApplicationMaster host: N/A 
     ApplicationMaster RPC port: -1 
     queue: root.default 
     start time: 1420449749969 
     final status: UNDEFINED 
     tracking URL: http://yyy14:8070/proxy/application_1420444011562_0023/ 
     user: nlp 
15/01/05 17:22:32 INFO yarn.Client: Application report for application_1420444011562_0023 (state: ACCEPTED) 
15/01/05 17:22:33 INFO yarn.Client: Application report for application_1420444011562_0023 (state: ACCEPTED) 
15/01/05 17:22:34 INFO yarn.Client: Application report for application_1420444011562_0023 (state: ACCEPTED) 
15/01/05 17:22:35 INFO yarn.Client: Application report for application_1420444011562_0023 (state: ACCEPTED) 
15/01/05 17:22:36 INFO yarn.Client: Application report for application_1420444011562_0023 (state: ACCEPTED) 
15/01/05 17:22:36 INFO cluster.YarnClientSchedulerBackend: ApplicationMaster registered as Actor[akka.tcp://sparkYarnAM@10.112.168.190:52855/user/YarnAM#435880073] 
15/01/05 17:22:36 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> yyy14, PROXY_URI_BASES -> http://yyy14:8070/proxy/application_1420444011562_0023), /proxy/application_1420444011562_0023 
15/01/05 17:22:36 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter 
15/01/05 17:22:37 INFO yarn.Client: Application report for application_1420444011562_0023 (state: RUNNING) 
15/01/05 17:22:37 INFO yarn.Client: 
     client token: Token { kind: YARN_CLIENT_TOKEN, service: } 
     diagnostics: N/A 
     ApplicationMaster host: yyy16 
     ApplicationMaster RPC port: 0 
     queue: root.default 
     start time: 1420449749969 
     final status: UNDEFINED 
     tracking URL: http://yyy14:8070/proxy/application_1420444011562_0023/ 
     user: nlp 
15/01/05 17:22:37 INFO cluster.YarnClientSchedulerBackend: Application application_1420444011562_0023 has started running. 
15/01/05 17:22:37 INFO netty.NettyBlockTransferService: Server created on 35648 
15/01/05 17:22:37 INFO storage.BlockManagerMaster: Trying to register BlockManager 
15/01/05 17:22:37 INFO storage.BlockManagerMasterActor: Registering block manager yyy2:35648 with 265.1 MB RAM, BlockManagerId(<driver>, yyy2, 35648) 
15/01/05 17:22:37 INFO storage.BlockManagerMaster: Registered BlockManager 
15/01/05 17:22:37 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM@10.112.168.190:52855] has failed, address is now gated for [5000] ms. Reason is: [Disassociated]. 
15/01/05 17:22:38 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED! 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/kill,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/static,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/threadDump/json,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/threadDump,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/json,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment/json,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd/json,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/json,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool/json,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/json,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/json,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/job/json,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/job,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/json,null} 
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs,null} 
15/01/05 17:22:38 INFO ui.SparkUI: Stopped Spark web UI at http://yyy2:4040 
15/01/05 17:22:38 INFO scheduler.DAGScheduler: Stopping DAGScheduler 
15/01/05 17:22:38 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors 
15/01/05 17:22:38 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down 
15/01/05 17:22:38 INFO cluster.YarnClientSchedulerBackend: Stopped 
15/01/05 17:22:39 INFO spark.MapOutputTrackerMasterActor: MapOutputTrackerActor stopped! 
15/01/05 17:22:39 INFO storage.MemoryStore: MemoryStore cleared 
15/01/05 17:22:39 INFO storage.BlockManager: BlockManager stopped 
15/01/05 17:22:39 INFO storage.BlockManagerMaster: BlockManagerMaster stopped 
15/01/05 17:22:39 INFO spark.SparkContext: Successfully stopped SparkContext 
15/01/05 17:22:39 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon. 
15/01/05 17:22:39 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports. 
15/01/05 17:22:39 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down. 
15/01/05 17:22:57 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms) 
Traceback (most recent call last): 
    File "/home/nlp/platform/spark-1.2.0-bin-2.5.2/examples/src/main/python/pi.py", line 29, in <module> 
    sc = SparkContext(appName="PythonPi") 
    File "/home/nlp/spark/python/pyspark/context.py", line 105, in __init__ 
    conf, jsc) 
    File "/home/nlp/spark/python/pyspark/context.py", line 153, in _do_init 
    self._jsc = jsc or self._initialize_context(self._conf._jconf) 
    File "/home/nlp/spark/python/pyspark/context.py", line 201, in _initialize_context 
    return self._jvm.JavaSparkContext(jconf) 
    File "/home/nlp/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 701, in __call__ 
    File "/home/nlp/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value 
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. 
: java.lang.NullPointerException 
     at org.apache.spark.SparkContext.<init>(SparkContext.scala:497) 
     at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61) 
     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 
     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
     at java.lang.reflect.Constructor.newInstance(Constructor.java:408) 
     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234) 
     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379) 
     at py4j.Gateway.invoke(Gateway.java:214) 
     at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79) 
     at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68) 
     at py4j.GatewayConnection.run(GatewayConnection.java:207) 
     at java.lang.Thread.run(Thread.java:745) 

Answers

Answer (score 1)

I ran into a similar problem with spark-submit and yarn-client (I got the same NPE/stack trace), and adjusting my memory settings did the trick. It seems to fail when you try to allocate too much memory. I would start by removing the --executor-memory and --driver-memory switches.
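
For illustration, here is the OP's Java invocation with the two memory switches dropped, so that the driver and executors fall back to the 512m default (a sketch of the suggestion, not a verified fix):

./bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn-client \
    --num-executors 3 \
    --executor-cores 1 \
    --queue thequeue \
    lib/spark-examples*.jar \
    10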

I have the same problem, but can you tell me how and where to remove the --executor-memory and --driver-memory switches? I am new to Python/Spark/the command line, so step-by-step guidance would be much appreciated. – ElinaJ

If you look at how the OP invoked spark-submit, you will see that he specified --executor-memory 2g and --driver-memory 4g. If you omit those command-line flags, each defaults to 512m. That matches my suggestion above: turn down your memory settings. – liggysmalls

Answer (score 2)

Try adding the deploy-mode parameter, like:

--deploy-mode cluster 

I had a problem like yours, and with this parameter it worked.
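
Applied to the OP's job, the full invocation might look like the sketch below; on Spark 1.x, 'yarn-cluster' is the master shorthand for YARN with cluster deploy mode (note that some older releases do not support cluster deploy mode for Python applications at all):

./bin/spark-submit --master yarn-cluster \
    examples/src/main/python/pi.py \
    10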

If you run 'spark-shell --master yarn-client --deploy-mode cluster', you get the error 'Error: Cluster deploy mode is not compatible with master "yarn-client"', so '--master yarn-client' cannot be used with this option. – lockwobr

@lockwobr Yes, because '--master yarn-client' is a shortcut for '--master yarn --deploy-mode client', you cannot combine it with '--deploy-mode cluster'. The solution in your context: replace '--master yarn-client' with '--master yarn-cluster'. – Murmel
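
To make the shorthand mapping concrete, here is a sketch using the OP's example (behaviour of the combined flags varies across Spark versions):

# client mode on YARN -- the following two are equivalent:
./bin/spark-submit --master yarn-client examples/src/main/python/pi.py
./bin/spark-submit --master yarn --deploy-mode client examples/src/main/python/pi.py

# cluster mode on YARN -- likewise equivalent:
./bin/spark-submit --master yarn-cluster examples/src/main/python/pi.py
./bin/spark-submit --master yarn --deploy-mode cluster examples/src/main/python/pi.py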

Answer (score 0)

I ran into this problem running the following (HDP 2.3, Spark 1.3.1):

spark-shell \
    --master yarn-client \
    --driver-memory 4g \
    --executor-memory 4g \
    --executor-cores 1 \
    --num-executors 4 

The solution for me was to set the following Spark configuration value:

spark.yarn.am.extraJavaOptions=-Dhdp.version=2.3.0.0-2557 
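
The same value can be passed with --conf at submit time or set once in conf/spark-defaults.conf; a sketch (the hdp.version value is specific to that HDP installation):

./bin/spark-submit --master yarn-client \
    --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=2.3.0.0-2557 \
    examples/src/main/python/pi.py

# or, equivalently, a single line in conf/spark-defaults.conf:
# spark.yarn.am.extraJavaOptions    -Dhdp.version=2.3.0.0-2557
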
Answer (score 1)

I made it work by lowering the number of cores in Advanced spark-env.
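
The answer does not show the exact setting, so as an assumption based on the Spark 1.x YARN environment variables, the change in conf/spark-env.sh might look like:

# conf/spark-env.sh (Spark 1.x on YARN) -- illustrative values
export SPARK_EXECUTOR_CORES=1      # cores per executor, lowered from a larger value
export SPARK_EXECUTOR_INSTANCES=2  # number of executors to request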

While this information may help solve the problem, providing additional context about _why_ and/or _how_ it answers the question would significantly improve its long-term value. Please [edit] your answer to add some explanation. –

Answer (score 5)

If you are running this example on Java 8, it may be caused by Java 8's excessive memory allocation: https://issues.apache.org/jira/browse/YARN-4714

You can force YARN to ignore this by setting the following properties in yarn-site.xml:

<property> 
    <name>yarn.nodemanager.pmem-check-enabled</name> 
    <value>false</value> 
</property> 

<property> 
    <name>yarn.nodemanager.vmem-check-enabled</name> 
    <value>false</value> 
</property> 
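
These two flags disable YARN's physical and virtual memory checks, so containers are no longer killed for exceeding their computed limits. As an assumption about the setup (not stated in the answer), the NodeManagers must be restarted for yarn-site.xml changes to take effect, e.g.:

# restart each NodeManager after editing yarn-site.xml (Hadoop 2.x sbin scripts)
$HADOOP_HOME/sbin/yarn-daemon.sh stop nodemanager
$HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager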

Works with Spark 2.0.1, Scala 2.11.8, Hadoop 2.7.3. – xring
