2017-06-14 125 views

I'm having a problem connecting to a Spark cluster. My application (the driver) runs in a local environment, and the Spark cluster runs in the cloud. When my application starts, it connects to the master successfully, but the executors fail to connect back to it. I think this is a network issue, such as an ACL, but I haven't been able to resolve it. Spark fails to connect to the executors.

Please help me.
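If it is a network problem as suspected, the standard Spark properties that control which address and ports the executors use to reach back to the driver are `spark.driver.host`, `spark.driver.port`, and `spark.driver.blockManager.port`. A minimal sketch (the host and port values below are placeholders, not taken from the original setup):

```shell
# Sketch only: pin the driver's advertised address and ports so the
# cluster's firewall/ACL can allow them. The values are placeholders.
spark-submit \
  --master spark://<master-host>:7077 \
  --conf spark.driver.host=<address-reachable-from-cluster> \
  --conf spark.driver.port=36000 \
  --conf spark.driver.blockManager.port=36001 \
  my-app.jar
```

With these set, only the fixed ports need to be opened toward the driver machine instead of the random ephemeral port seen in the log.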

Here is the error log:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/06/14 18:57:25 INFO CoarseGrainedExecutorBackend: Started daemon with process name: [email protected]
17/06/14 18:57:25 INFO SignalUtils: Registered signal handler for TERM
17/06/14 18:57:25 INFO SignalUtils: Registered signal handler for HUP
17/06/14 18:57:25 INFO SignalUtils: Registered signal handler for INT
17/06/14 18:57:26 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/06/14 18:57:26 INFO SecurityManager: Changing view acls to: irteam,dongyoung
17/06/14 18:57:26 INFO SecurityManager: Changing modify acls to: irteam,dongyoung
17/06/14 18:57:26 INFO SecurityManager: Changing view acls groups to:
17/06/14 18:57:26 INFO SecurityManager: Changing modify acls groups to:
17/06/14 18:57:26 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(irteam, dongyoung); groups with view permissions: Set(); users with modify permissions: Set(irteam, dongyoung); groups with modify permissions: Set()
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:70)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:174)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:270)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:88)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:188)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:71)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:70)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
    ... 4 more
Caused by: java.io.IOException: Failed to connect to /10.70.22.192:59291
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:191)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection timed out: /10.70.22.192:59291
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    ... 1 more



Does ping work to 10.70.22.192? – BDR


10.70.22.192 is my local IP address. –

Answer


This is a user-permissions problem, at least that's what the log says.

You should launch the Spark job from your local driver node using a user identity that has access to the cluster.

Trigger your job with a user that exists at the hdfs/spark level.
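One hedged way to do that, assuming the cluster uses Hadoop's default simple (non-Kerberos) authentication, is to set the `HADOOP_USER_NAME` environment variable before submitting, so the job runs as a user the cluster recognizes (the user name below is a placeholder):

```shell
# Assumes simple Hadoop authentication; "sparkuser" is a placeholder
# for a user that actually exists on the cluster.
export HADOOP_USER_NAME=sparkuser
spark-submit --master spark://<master-host>:7077 my-app.jar
```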


I think so too. How do I set up a user ID on my local PC that has access to the cluster? –


You should have the same user:group on your system as on the cluster. Try submitting the Spark job in 'cluster' deploy mode (instead of 'client' mode). –
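A sketch of that suggestion: in cluster deploy mode the driver itself runs on a cluster node, so the executors connect to it over the cluster network instead of reaching back to the local PC (the class name and host are placeholders):

```shell
# --deploy-mode cluster launches the driver inside the cluster,
# avoiding the executor -> local-PC connection entirely.
spark-submit \
  --master spark://<master-host>:7077 \
  --deploy-mode cluster \
  --class com.example.MyApp \
  my-app.jar
```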


How can my application run in cluster mode when it isn't launched through the spark-submit script? –
