2014-07-10

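Unable to start NodeManager after re-adding a host with Ambari

I deleted my host and then tried to add it again. The DataNode works fine, but I cannot get the NodeManager to work. After deleting the host, I removed the Hadoop YARN packages with yum and then reinstalled them through Ambari. Now, when I try to start the NodeManager from Ambari, I get the following error: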

2014-05-23 19:40:41,507 - Execute['export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop-yarn/sbin/yarn-daemon.sh --config /etc/hadoop/conf start nodemanager'] {'not_if': 'ls /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid` >/dev/null 2>&1', 'user': 'yarn'} 
2014-05-23 19:40:42,570 - Execute['ls /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid` >/dev/null 2>&1'] {'initial_wait': 5, 'not_if': 'ls /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid` >/dev/null 2>&1', 'user': 'yarn'} 
2014-05-23 19:40:47,621 - Error while executing command 'start': 
Traceback (most recent call last): 
    File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 112, in execute 
    method(env) 
    File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/YARN/package/scripts/nodemanager.py", line 42, in start 
    action='start' 
    File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/YARN/package/scripts/service.py", line 51, in service 
    initial_wait=5 
    File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__ 
    self.env.run() 
    File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run 
    self.run_action(resource, action) 
    File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action 
    provider_action() 
    File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 239, in action_run 
    raise ex 
Fail: Execution of 'ls /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid` >/dev/null 2>&1' returned 1. 

That does not really tell me what the problem is. If I try to start the NodeManager manually with yarn, I get this error:

14/07/10 13:44:48 FATAL nodemanager.NodeManager: Error starting NodeManager 
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, Message from ResourceManager: Disallowed NodeManager from r3888, Sending SHUTDOWN signal to the NodeManager. 
     at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:196) 
     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) 
     at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120) 
     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:197) 
     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) 
     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:358) 
     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:404) 
Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, Message from ResourceManager: Disallowed NodeManager from r3888, Sending SHUTDOWN signal to the NodeManager. 
     at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:265) 
     at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:190) 
     ... 6 more 
14/07/10 13:44:48 INFO nodemanager.NodeManager: SHUTDOWN_MSG: 
/************************************************************ 
SHUTDOWN_MSG: Shutting down NodeManager at r3888 

Has anyone had a similar problem with removing and adding a NameNode on a host with Ambari? I would like to avoid rebuilding the host from scratch.

Answer

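We actually ran into the same situation: we could not reuse a previously known host name in the cluster. It turned out that the node was explicitly listed in the YARN ResourceManager's exclude list, so we:

  • Removed the name being reused from /etc/hadoop/conf/yarn.exclude
  • Called yarn rmadmin -refreshNodes so that YARN re-read this configuration file

In our case, the NodeManager then started fine and re-registered cleanly.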
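
As a minimal sketch, the two steps above could be scripted roughly like this. The function name remove_from_exclude is hypothetical, r3888 is simply the host name from the log in the question, and the sketch assumes the exclude file holds one host name per line. The path is the one mentioned above; on other clusters it may differ, since it is set by yarn.resourcemanager.nodes.exclude-path in yarn-site.xml.

```shell
#!/bin/sh
# Sketch only: remove one host from a YARN exclude file.
# Assumes one host name per line; remove_from_exclude is a hypothetical helper.

remove_from_exclude() {
    # $1 = path to the exclude file, $2 = host name to remove
    file=$1
    host=$2
    tmp=$(mktemp) || return 1

    # Keep every line that is NOT an exact, whole-line match for the host.
    # grep exits with 1 when no lines are left, which is not an error here.
    status=0
    grep -v -x -F -- "$host" "$file" > "$tmp" || status=$?
    if [ "$status" -gt 1 ]; then
        rm -f "$tmp"
        return 1
    fi

    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example usage, run as a user allowed to administer YARN:
#   remove_from_exclude /etc/hadoop/conf/yarn.exclude r3888
#   yarn rmadmin -refreshNodes
```

If Ambari manages the exclude file, it may rewrite it later, so it is worth checking the file again after the next Ambari-driven configuration change.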

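Thanks for the info! I had actually already gotten rid of this problem. There are some bugs around removing and re-adding the same host, but I think they have been fixed in later versions – Hannes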