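Hadoop streaming with RVM fails to find gem
Original question (long version below). Short version: running Hadoop streaming with a Ruby script as the mapper, with RVM installed on all cluster nodes, does not work, because Ruby is not recognized by the shell that Hadoop launches (and RVM is not loaded correctly). Why?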
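I want to use the wukong gem to create map/reduce jobs for Hadoop. The problem is that the wukong gem cannot be loaded when the script is run by Hadoop (i.e. it is not found). The Hadoop job gives me the following error: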
/usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/site_ruby/1.9.1/rubygems/custom_require.rb:36:in `require': cannot load such file -- wukong (LoadError)
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/site_ruby/1.9.1/rubygems/custom_require.rb:36:in `require'
from /tmp/mapr-hadoop/mapred/local/taskTracker/admin/jobcache/job_201207061102_0068/attempt_201207061102_0068_m_000000_0/work/./test.rb:6:in `<main>'
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:394)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1109)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
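However, running cat somefile | ./test.rb --map on each of the cluster machines works as expected. In addition, I included some debug prints in my test file, whose output I can retrieve from the Hadoop logs. When running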
$stderr.puts `gem list`
it lists all installed gems, including wukong. Also,
$stderr.puts $LOAD_PATH.inspect
yields exactly the same paths as printing $LOAD_PATH from the Ruby script when it is run locally (rather than launched by Hadoop).
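For comparison between the local run and the Hadoop-launched run, the two debug prints could be extended into a slightly broader report. The following is only a minimal sketch, not part of the original test.rb: the helper name gem_environment_report is hypothetical, and it assumes that the relevant differences would show up in GEM_HOME, GEM_PATH, PATH, or the directories RubyGems searches in-process. Note that gem list runs in a separate process, so it does not necessarily reflect what require sees inside the script itself.

```ruby
#!/usr/bin/env ruby
# Hypothetical diagnostic helper (an assumption, not taken from the original test.rb).
require 'rubygems'

# Collect the values that influence where `require` looks for gems.
# `env` defaults to the real environment; a plain Hash can be passed for testing.
def gem_environment_report(env = ENV)
  {
    'ruby_version' => RUBY_VERSION,
    'GEM_HOME'     => env['GEM_HOME'],
    'GEM_PATH'     => env['GEM_PATH'],
    'PATH'         => env['PATH'],
    # Directories RubyGems searches inside this Ruby process.
    'Gem.path'     => Gem.path,
    # Whether RubyGems can see a wukong gem specification from this process.
    'wukong_found' => !Gem::Specification.find_all_by_name('wukong').empty?
  }
end

# Hadoop streaming treats stdout as mapper output, so diagnostics go to stderr,
# where they end up in the task logs.
gem_environment_report.each do |key, value|
  $stderr.puts "#{key}=#{value.inspect}"
end
```

Running this once locally and once through Hadoop, then diffing the two stderr outputs, should show whether the task's environment differs from the interactive shell's.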
Why does the Ruby script launched by Hadoop fail to find a gem that is clearly installed and working?
Hadoop is launched with:
hadoop jar /opt/mapr/hadoop/hadoop-0.20.2/contrib/streaming/hadoop-0.20.2-dev-streaming.jar \
-libjars /opt/hypertable/current/lib/java/hypertable-0.9.5.6.jar,/opt/hypertable/current/lib/java/libthrift-0.8.0.jar \
-Dmapred.child.env="PATH=$PATH:/usr/local/rvm/bin/rvm" \
-mapper '/home/admin/wukong/test.rb --map' \
-file /home/admin/wukong/test.rb \
-reducer /bin/cat \
-input /test/test.rb \
-output /test/something2