2016-05-18 91 views
2

I am trying to install Spark 1.6.1 on Windows 10, and so far I have done the following... winutils spark windows installation

  1. Downloaded Spark 1.6.1, unpacked it to a directory, and set SPARK_HOME
  2. Downloaded Scala 2.11.8, unpacked it to a directory, and set SCALA_HOME
  3. Set the _JAVA_OPTION environment variable
  4. Downloaded winutils from https://github.com/steveloughran/winutils.git by downloading the zip of the repository, then set the HADOOP_HOME environment variable. (Not sure whether this part is wrong; I could not clone the repository because permission was denied.)
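For reference, the environment variables from the steps above can be set persistently from a Command Prompt roughly like this (the install locations are hypothetical examples; substitute wherever you actually unpacked things):

```bat
:: Hypothetical install locations -- adjust to your own paths.
:: Use paths without spaces (e.g. C:\Spark, not C:\Program Files\Spark).
setx SPARK_HOME C:\Spark
setx SCALA_HOME C:\Scala
setx HADOOP_HOME C:\Hadoop
:: The standard JVM variable is _JAVA_OPTIONS; e.g. to cap the heap:
setx _JAVA_OPTIONS -Xmx512M
```

Note that `setx` affects new Command Prompt windows, not the one you ran it in.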

When I go to the Spark home directory and run bin\spark-shell, I get

'C:\Program' is not recognized as an internal or external command, operable program or batch file. 

I must be missing something. I don't see how I could run a bash script from a Windows environment anyway, but hopefully I don't need to understand that just to get this working. I have been following this tutorial: https://hernandezpaul.wordpress.com/2016/01/24/apache-spark-installation-on-windows-10/. Any help would be appreciated.

Answer

3

You need to download the winutils executable, not the source code.

You can download it here, or if you really want the entire Hadoop distribution, you can find the 2.6.0 binaries here. You then need to set HADOOP_HOME to the directory containing winutils.exe.
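Hadoop's native-binary lookup on Windows resolves the executable as %HADOOP_HOME%\bin\winutils.exe, so in practice the exe should sit in a bin\ subfolder of whatever HADOOP_HOME points at. A quick check from cmd, assuming a hypothetical C:\Hadoop location:

```bat
:: Hypothetical location -- adjust to where you placed winutils.exe.
set HADOOP_HOME=C:\Hadoop
:: This should list the file; "File Not Found" means HADOOP_HOME is wrong.
dir "%HADOOP_HOME%\bin\winutils.exe"
```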

Also, make sure the directory you put Spark in does not contain any spaces. This is very important; otherwise it will not work.
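That restriction is also the likely cause of your original error: when a launcher script expands an install path unquoted, a path such as C:\Program Files is split at the space, and only the first token is treated as the command. A minimal POSIX-shell sketch of that word splitting (the path is a made-up example):

```shell
#!/bin/sh
# Hypothetical install path containing a space:
SPARK_CMD='C:\Program Files\Spark\bin\spark-shell'
# Unquoted expansion splits on the space, so the "command" becomes only the
# first token -- which is why Windows reports 'C:\Program' is not recognized.
set -- $SPARK_CMD
printf 'token 1: %s\n' "$1"   # token 1: C:\Program
printf 'token 2: %s\n' "$2"   # token 2: Files\Spark\bin\spark-shell
```

Installing under a space-free path such as C:\Spark sidesteps the problem entirely.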

Once you have set that up, you do not start spark-shell.sh; you start spark-shell.cmd:

C:\Spark\bin>spark-shell 
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). 
log4j:WARN Please initialize the log4j system properly. 
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. 
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties 
To adjust logging level use sc.setLogLevel("INFO") 
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.1
      /_/

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_91) 
Type in expressions to have them evaluated. 
Type :help for more information. 
Spark context available as sc. 
16/05/18 19:31:56 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-core-3.2.10.jar." 
16/05/18 19:31:56 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-api-jdo-3.2.6.jar." 
16/05/18 19:31:56 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-rdbms-3.2.9.jar." 
16/05/18 19:31:56 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies) 
16/05/18 19:31:56 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies) 
16/05/18 19:32:01 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0 
16/05/18 19:32:01 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException 
16/05/18 19:32:07 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-core-3.2.10.jar." 
16/05/18 19:32:07 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-api-jdo-3.2.6.jar." 
16/05/18 19:32:07 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-rdbms-3.2.9.jar." 
16/05/18 19:32:07 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies) 
16/05/18 19:32:08 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies) 
16/05/18 19:32:12 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0 
16/05/18 19:32:12 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException 
SQL context available as sqlContext. 

scala> 
+0

Thanks so much for the help; the space in the path was the problem! When running spark-shell I hit another error, related to building Spark: "Failed to find Spark assembly JAR. You need to build Spark before running this program." I will look into that one. I'm glad I posted this question; I didn't expect it to be just a whitespace issue, but it makes sense, since robust command-line parsing isn't really the point of a utility like this. –

+1

@Mike I agree, but that's what we have :\ –

+0

Hi Yuval, should winutils be 32-bit or 64-bit? I am still hitting an error while the SQL context is being initialized, right in the middle of spark-shell startup; I have been trying to track it down. It gives me the following warning: "Your hostname, DELE-6565 resolves to a loopback/non-reachable address: fe80:0:0:0:0:5efe:c0a8:103%net1, but we couldn't find any external IP address!" and then throws an exception... "java.lang.RuntimeException: java.lang.NullPointerException." I am trying to figure out the cause. –