
Cannot run Spark 1.0 SparkPi: I am stuck with a problem running the SparkPi example on HDP 2.0.

I downloaded the Spark 1.0 prebuilt package (for HDP2) from http://spark.apache.org/downloads.html and ran the example from the Spark website:

./bin/spark-submit --class org.apache.spark.examples.SparkPi  --master yarn-cluster --num-executors 3 --driver-memory 2g --executor-memory 2g --executor-cores 1 ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 2 

I get this error:

Application application_1404470405736_0044 failed 3 times due to AM Container for appattempt_1404470405736_0044_000003 exited with exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
.Failing this attempt.. Failing the application.

Unknown/unsupported param List(--executor-memory, 2048, --executor-cores, 1, --num-executors, 3)
Usage: org.apache.spark.deploy.yarn.ApplicationMaster [options]
Options:
    --jar JAR_PATH    Path to your application's JAR file (required)
    --class CLASS_NAME   Name of your application's main class (required)
    ...bla-bla-bla

Any ideas? How can I make it work?


I think it's pretty obvious you are not passing the parameters correctly: 'Unknown/unsupported param List(--executor-memory, 2048, --executor-cores, 1, --num-executors, 3)'. I'd recommend looking at the '...bla-bla-bla' part – aaronman

Answer


I had the same problem. The reason is that the version of spark-assembly.jar on HDFS differs from your current Spark version.

For example, here is the list of params for org.apache.spark.deploy.yarn.Client in the version that is on HDFS:

$ hadoop jar ./spark-assembly.jar org.apache.spark.deploy.yarn.Client --help 
Usage: org.apache.spark.deploy.yarn.Client [options] 
Options: 
    --jar JAR_PATH    Path to your application's JAR file (required in yarn-cluster mode) 
    --class CLASS_NAME   Name of your application's main class (required) 
    --args ARGS    Arguments to be passed to your application's main class. 
          Mutliple invocations are possible, each will be passed in order. 
    --num-workers NUM   Number of workers to start (Default: 2) 
    --worker-cores NUM   Number of cores for the workers (Default: 1). This is unsused right now. 
    --master-class CLASS_NAME Class Name for Master (Default: spark.deploy.yarn.ApplicationMaster) 
    --master-memory MEM  Memory for Master (e.g. 1000M, 2G) (Default: 512 Mb) 
    --worker-memory MEM  Memory per Worker (e.g. 1000M, 2G) (Default: 1G) 
    --name NAME    The name of your application (Default: Spark) 
    --queue QUEUE    The hadoop queue to use for allocation requests (Default: 'default') 
    --addJars jars    Comma separated list of local jars that want SparkContext.addJar to work with. 
    --files files    Comma separated list of files to be distributed with the job. 
    --archives archives  Comma separated list of archives to be distributed with the job. 

And the same help output for the newly installed spark-assembly jar:

$ hadoop jar ./spark-assembly-1.0.0-cdh5.1.0-hadoop2.3.0-cdh5.1.0.jar org.apache.spark.deploy.yarn.Client 
Usage: org.apache.spark.deploy.yarn.Client [options] 
Options: 
    --jar JAR_PATH    Path to your application's JAR file (required in yarn-cluster mode) 
    --class CLASS_NAME   Name of your application's main class (required) 
    --arg ARGS     Argument to be passed to your application's main class. 
          Multiple invocations are possible, each will be passed in order. 
    --num-executors NUM  Number of executors to start (Default: 2) 
    --executor-cores NUM  Number of cores for the executors (Default: 1). 
    --driver-memory MEM  Memory for driver (e.g. 1000M, 2G) (Default: 512 Mb) 
    --executor-memory MEM  Memory per executor (e.g. 1000M, 2G) (Default: 1G) 
    --name NAME    The name of your application (Default: Spark) 
    --queue QUEUE    The hadoop queue to use for allocation requests (Default: 'default') 
    --addJars jars    Comma separated list of local jars that want SparkContext.addJar to work with. 
    --files files    Comma separated list of files to be distributed with the job. 
    --archives archives  Comma separated list of archives to be distributed with the job. 

Notice that the old Client only understands the --num-workers/--worker-memory/--worker-cores style options, while Spark 1.0's spark-submit passes --num-executors/--executor-memory/--executor-cores, which is exactly the 'Unknown/unsupported param' list in the error above. So I uploaded the matching spark-assembly.jar to HDFS, and Spark started working fine.
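For reference, a minimal sketch of that fix as shell commands; the HDFS path /user/spark/share/lib and the exact assembly file name are assumptions on my part, so substitute whatever location your cluster actually stages the Spark assembly in:

    # Replace the stale assembly on HDFS with the one shipped in the Spark 1.0 download
    # (the HDFS path below is hypothetical -- use your cluster's actual staging location)
    hadoop fs -rm /user/spark/share/lib/spark-assembly.jar
    hadoop fs -put ./lib/spark-assembly-1.0.0-hadoop2.2.0.jar /user/spark/share/lib/spark-assembly.jar

    # Point the client at the assembly on HDFS so YARN containers pick up the matching version
    # (SPARK_JAR is the Spark 1.0-era setting; later 1.x releases use the spark.yarn.jar property instead)
    export SPARK_JAR=hdfs:///user/spark/share/lib/spark-assembly.jar

After that, the spark-submit command from the question should run against an ApplicationMaster that understands the --num-executors/--executor-memory options.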