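How to submit a Mahout recommendation job using the HDInsight .NET SDK

I am new to HDInsight. I want to learn and practice machine learning, and HDInsight looks like what I need, but there seems to be no direct API for Mahout. Since a Mahout recommendation essentially gets translated into MapReduce jobs, I followed some of the MapReduce examples in the Windows Azure documentation and wrote the following code: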
// Define the MapReduce job
MapReduceJobCreateParameters mrJobDefinition = new MapReduceJobCreateParameters()
{
    JarFile = "wasb:///example/jars/mahout-core-0.9-job.jar",
    ClassName = "org.apache.mahout.cf.taste.hadoop.item.RecommenderJob",
};
mrJobDefinition.Arguments.Add(" -s SIMILARITY_COOCCURRENCE");
mrJobDefinition.Arguments.Add(" --input=/reply");
mrJobDefinition.Arguments.Add(" --output=/recommend/");
mrJobDefinition.Arguments.Add(" --usersFile=/data/users.txt");
I have uploaded mahout-core-0.9-job.jar to /example/jars in the specified Azure blob storage container.
But I receive the following error message:
14/04/03 12:04:28 ERROR security.UserGroupInformation: PriviledgedActionException as:johnny cause:java.io.IOException: Exception reading file:/c:/apps/temp/hdfs/mapred/local/taskTracker/johnny/jobcache/job_201404031203_0001/jobToken=
java.security.PrivilegedActionException: java.io.IOException: Exception reading file:/c:/apps/temp/hdfs/mapred/local/taskTracker/johnny/jobcache/job_201404031203_0001/jobToken=
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1233)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:951)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
    at org.apache.mahout.cf.taste.hadoop.preparation.PreparePreferenceMatrixJob.run(PreparePreferenceMatrixJob.java:77)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.run(RecommenderJob.java:164)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.main(RecommenderJob.java:322)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.io.IOException: Exception reading file:/c:/apps/temp/hdfs/mapred/local/taskTracker/johnny/jobcache/job_201404031203_0001/jobToken=
    at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:136)
    at org.apache.hadoop.mapred.JobClient.readTokensFromFiles(JobClient.java:2149)
    at org.apache.hadoop.mapred.JobClient.populateTokenCache(JobClient.java:2185)
    at org.apache.hadoop.mapred.JobClient.access$300(JobClient.java:179)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:964)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:951)
    ... 16 more
Caused by: java.io.FileNotFoundException: File file:/c:/apps/temp/hdfs/mapred/local/taskTracker/johnny/jobcache/job_201404031203_0001/jobToken= does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:427)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:254)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:125)
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:436)
    at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:130)
    ... 21 more
Exception in thread "main" java.io.IOException: Exception reading file:/c:/apps/temp/hdfs/mapred/local/taskTracker/johnny/jobcache/job_201404031203_0001/jobToken=
    at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:136)
    at org.apache.hadoop.mapred.JobClient.readTokensFromFiles(JobClient.java:2149)
    at org.apache.hadoop.mapred.JobClient.populateTokenCache(JobClient.java:2185)
    at org.apache.hadoop.mapred.JobClient.access$300(JobClient.java:179)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:964)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:951)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1233)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:951)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
    at org.apache.mahout.cf.taste.hadoop.preparation.PreparePreferenceMatrixJob.run(PreparePreferenceMatrixJob.java:77)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.run(RecommenderJob.java:164)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.main(RecommenderJob.java:322)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.io.FileNotFoundException: File file:/c:/apps/temp/hdfs/mapred/local/taskTracker/johnny/jobcache/job_201404031203_0001/jobToken= does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:427)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:254)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:125)
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:436)
    at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:130)
    ... 21 more
Shutting down the watcher/keep-alive thread pool forcefully
templeton: job failed with exit code 1
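After searching the internet, it seems I need to make some changes to mapred-site.xml or other Hadoop configuration files. However, I am completely new to Apache Hadoop and do not know much about Linux or Java.
Any help or direction would be greatly appreciated.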