2015-08-25

Setting the HADOOP_HOME variable on Windows

I want to use Spark together with Hadoop on my Windows 8 machine. But no matter what my code is, I get this error:

15/08/25 19:29:58 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path 
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries. 
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:355) 
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:370) 
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:363) 
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79) 
    at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:104) 
    at org.apache.hadoop.security.Groups.<init>(Groups.java:86) 
    at org.apache.hadoop.security.Groups.<init>(Groups.java:66) 
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:280) 
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:271) 
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:248) 
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:763) 
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:748) 
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:621) 
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2162) 
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2162) 
    at scala.Option.getOrElse(Option.scala:120) 
    at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2162) 
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:301) 
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61) 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source) 
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) 
    at java.lang.reflect.Constructor.newInstance(Unknown Source) 
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234) 
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379) 
    at py4j.Gateway.invoke(Gateway.java:214) 
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79) 
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68) 
    at py4j.GatewayConnection.run(GatewayConnection.java:207) 
    at java.lang.Thread.run(Unknown Source) 

As you can see:

null\bin\winutils.exe 

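the Hadoop home path is null. I tried setting HADOOP_HOME as an environment variable, but that did not solve the problem. Any help or comments on this would be greatly appreciated.

Thanks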

Answer


I managed to solve this by putting the following code at the very beginning of my script:

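import sys 
import os 

# Set this before the SparkContext is created, so the JVM that PySpark launches inherits it.
os.environ['HADOOP_HOME'] = "C:/Mine/Spark/hadoop-2.6.0"  # this folder must contain bin\winutils.exe
sys.path.append("C:/Mine/Spark/hadoop-2.6.0/bin") 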

I hope this helps someone. If anyone has a better idea, I would certainly appreciate hearing it.
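For anyone who wants the failure to surface earlier, here is a minimal sketch of the same idea with a sanity check added. The helper name `configure_hadoop_home` is hypothetical (it is not part of Spark or Hadoop), and the sketch assumes winutils.exe has already been placed in the `bin` folder of the Hadoop directory. It relies on the fact that Hadoop's Shell class takes its home directory from the `hadoop.home.dir` system property or the `HADOOP_HOME` environment variable, which is why the path shows up as `null` when neither is set.

```python
import os


def configure_hadoop_home(hadoop_home, environ=None):
    """Set HADOOP_HOME and put its bin directory on PATH (hypothetical helper).

    Returns the path where winutils.exe was found. Raises FileNotFoundError
    if the file is missing, so the problem is reported before Spark starts.
    """
    # Default to the real process environment; a plain dict can be passed for testing.
    env = os.environ if environ is None else environ

    bin_dir = os.path.join(hadoop_home, "bin")
    winutils = os.path.join(bin_dir, "winutils.exe")
    if not os.path.isfile(winutils):
        raise FileNotFoundError("winutils.exe not found at " + winutils)

    env["HADOOP_HOME"] = hadoop_home
    # PATH (not sys.path) is what the operating system searches for executables.
    env["PATH"] = bin_dir + os.pathsep + env.get("PATH", "")
    return winutils
```

With the folder from my setup, that would be a call like `configure_hadoop_home("C:/Mine/Spark/hadoop-2.6.0")` placed before the SparkContext is created. Whether the PATH entry is needed at all probably depends on the setup; setting HADOOP_HOME is the part that addresses the `null` in the error message.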