
Hadoop YARN job is getting stuck at map 0% and reduce 0%

I am trying to run a very simple job to test my Hadoop setup. I tried the word count example, and it got stuck at 0%; I then tried several other simple jobs, and every one of them got stuck the same way.

52191_0003/ 
14/07/14 23:55:51 INFO mapreduce.Job: Running job: job_1405376352191_0003 
14/07/14 23:55:57 INFO mapreduce.Job: Job job_1405376352191_0003 running in uber mode : false 
14/07/14 23:55:57 INFO mapreduce.Job: map 0% reduce 0% 

I am using Hadoop version 2.3.0-cdh5.0.2.

I did some quick research on Google and found suggestions to increase the following properties:

yarn.scheduler.minimum-allocation-mb 
yarn.nodemanager.resource.memory-mb 

I have a single-node cluster running on my MacBook with a dual-core CPU and 8 GB of RAM.

My yarn-site.xml:

<configuration> 

<!-- Site specific YARN configuration properties --> 
    <property> 
    <property> 
    <name>yarn.resourcemanager.hostname</name> 
    <value>resourcemanager.company.com</value> 
    </property> 
    <property> 
    <description>Classpath for typical applications.</description> 
    <name>yarn.application.classpath</name> 
    <value> 
     $HADOOP_CONF_DIR, 
     $HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*, 
     $HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*, 
     $HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*, 
     $HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/* 
    </value> 
    </property> 

    <property> 
    <name>yarn.nodemanager.local-dirs</name> 
    <value>file:///data/1/yarn/local,file:///data/2/yarn/local,file:///data/3/yarn/local</value> 
    </property> 
    <property> 
    <name>yarn.nodemanager.log-dirs</name> 
    <value>file:///data/1/yarn/logs,file:///data/2/yarn/logs,file:///data/3/yarn/logs</value> 
    </property> 
    <property> 
    </property> 
    <name>yarn.log.aggregation.enable</name> 
    <value>true</value> 
    <property> 
    <description>Where to aggregate logs</description> 
    <name>yarn.nodemanager.remote-app-log-dir</name> 
    <value>hdfs://var/log/hadoop-yarn/apps</value> 
    </property> 
    <property> 
    <name>yarn.nodemanager.aux-services</name> 
    <value>mapreduce_shuffle</value> 
    <description>shuffle service that needs to be set for Map Reduce to run </description> 
    </property> 
    <property> 
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name> 
    <value>org.apache.hadoop.mapred.ShuffleHandler</value> 
    </property> 
    </property> 

    <property> 
     <name>yarn.app.mapreduce.am.resource.mb</name> 
     <value>8092</value> 
    </property> 
    <property> 
     <name>yarn.app.mapreduce.am.command-opts</name> 
     <value>-Xmx768m</value> 
    </property> 
    <property> 
     <name>mapreduce.framework.name</name> 
     <value>yarn</value> 
     <description>Execution framework.</description> 
    </property> 
    <property> 
     <name>mapreduce.map.cpu.vcores</name> 
     <value>4</value> 
     <description>The number of virtual cores required for each map task.</description> 
    </property> 
    <property> 
     <name>mapreduce.map.memory.mb</name> 
     <value>8092</value> 
     <description>Larger resource limit for maps.</description> 
    </property> 
    <property> 
     <name>mapreduce.map.java.opts</name> 
     <value>-Xmx768m</value> 
     <description>Heap-size for child jvms of maps.</description> 
    </property> 
    <property> 
     <name>mapreduce.jobtracker.address</name> 
     <value>jobtracker.alexjf.net:8021</value> 
    </property> 

<property> 
    <name>yarn.scheduler.minimum-allocation-mb</name> 
    <value>2048</value> 
    <description>Minimum limit of memory to allocate to each container request at the Resource Manager.</description> 
    </property> 
    <property> 
    <name>yarn.scheduler.maximum-allocation-mb</name> 
    <value>8092</value> 
    <description>Maximum limit of memory to allocate to each container request at the Resource Manager.</description> 
    </property> 
    <property> 
    <name>yarn.scheduler.minimum-allocation-vcores</name> 
    <value>2</value> 
    <description>The minimum allocation for every container request at the RM, in terms of virtual CPU cores. Requests lower than this won't take effect, and the specified value will get allocated the minimum.</description> 
    </property> 
    <property> 
    <name>yarn.scheduler.maximum-allocation-vcores</name> 
    <value>10</value> 
    <description>The maximum allocation for every container request at the RM, in terms of virtual CPU cores. Requests higher than this won't take effect, and will get capped to this value.</description> 
    </property> 
    <property> 
    <name>yarn.nodemanager.resource.memory-mb</name> 
    <value>2048</value> 
    <description>Physical memory, in MB, to be made available to running containers</description> 
    </property> 
    <property> 
    <name>yarn.nodemanager.resource.cpu-vcores</name> 
    <value>4</value> 
    <description>Number of CPU cores that can be allocated for containers.</description> 
    </property> 
    <property> 
    <name>yarn.nodemanager.aux-services</name> 
    <value>mapreduce_shuffle</value> 
    <description>shuffle service that needs to be set for Map Reduce to run </description> 
    </property> 
    <property> 
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name> 
    <value>org.apache.hadoop.mapred.ShuffleHandler</value> 
    </property> 

</configuration> 

My mapred-site.xml:

<property>  
    <name>mapreduce.framework.name</name>  
    <value>yarn</value> 
    </property> 

It has only this one property. I have tried several permutations and combinations, but I cannot get rid of the error.

Job logs:

2014-07-14 23:55:55,694 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 
2014-07-14 23:55:55,697 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 
2014-07-14 23:55:55,699 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030 
2014-07-14 23:55:55,769 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: maxContainerCapability: 8092 
2014-07-14 23:55:55,769 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: queue: root.abhishekchoudhary 
2014-07-14 23:55:55,775 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Upper limit on the thread pool size is 500 
2014-07-14 23:55:55,777 INFO [main] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: yarn.client.max-nodemanagers-proxies : 500 
2014-07-14 23:55:55,787 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1405376352191_0003Job Transitioned from INITED to SETUP 
2014-07-14 23:55:55,789 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_SETUP 
2014-07-14 23:55:55,800 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1405376352191_0003Job Transitioned from SETUP to RUNNING 
2014-07-14 23:55:55,823 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1405376352191_0003_m_000000 Task Transitioned from NEW to SCHEDULED 
2014-07-14 23:55:55,824 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1405376352191_0003_m_000001 Task Transitioned from NEW to SCHEDULED 
2014-07-14 23:55:55,824 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1405376352191_0003_m_000002 Task Transitioned from NEW to SCHEDULED 
2014-07-14 23:55:55,825 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1405376352191_0003_m_000003 Task Transitioned from NEW to SCHEDULED 
2014-07-14 23:55:55,826 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1405376352191_0003_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED 
2014-07-14 23:55:55,827 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1405376352191_0003_m_000001_0 TaskAttempt Transitioned from NEW to UNASSIGNED 
2014-07-14 23:55:55,827 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1405376352191_0003_m_000002_0 TaskAttempt Transitioned from NEW to UNASSIGNED 
2014-07-14 23:55:55,827 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1405376352191_0003_m_000003_0 TaskAttempt Transitioned from NEW to UNASSIGNED 
2014-07-14 23:55:55,828 INFO [Thread-49] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: mapResourceReqt:8092 
2014-07-14 23:55:55,858 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer setup for JobId: job_1405376352191_0003, File: hdfs://localhost/tmp/hadoop-yarn/staging/abhishekchoudhary/.staging/job_1405376352191_0003/job_1405376352191_0003_1.jhist 
2014-07-14 23:55:56,773 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:4 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 HostLocal:0 RackLocal:0 
2014-07-14 23:55:56,799 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1405376352191_0003: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:0, vCores:0> knownNMs=1 

Answers


I don't know if this is just an artifact of how you pasted the question together, but looking at your yarn-site.xml, it opens with two <property> tags, which looks like a copy/paste error. I am not sure whether Hadoop's XML parser will actually apply nested <property> tags.
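
For reference, Hadoop configuration files expect each setting to be its own sibling <property> element directly under <configuration>, never nested inside another <property>. A minimal sketch of the expected shape, using your own first setting:

<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>resourcemanager.company.com</value>
    </property>
    <!-- each further setting is another sibling <property> element -->
</configuration>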


It does take them; it doesn't fail with a parsing exception, since they all expect a list of property elements inside. I removed them and tried as well, but no luck. –


Based on the message Connecting to ResourceManager at /0.0.0.0:8030, are you sure your ResourceManager is really supposed to be at 0.0.0.0:8030 (the default)? If not, you should add the following to your yarn-site.xml:

<property> 
    <name>yarn.resourcemanager.hostname</name> 
    <value>MASTER ADDRESS</value> 
</property> 
<property> 
    <name>yarn.resourcemanager.resource-tracker.address</name> 
    <value>${yarn.resourcemanager.hostname}:8025</value> 
</property> 
<property> 
    <name>yarn.resourcemanager.scheduler.address</name> 
    <value>${yarn.resourcemanager.hostname}:8030</value> 
</property> 
<property> 
    <name>yarn.resourcemanager.address</name> 
    <value>${yarn.resourcemanager.hostname}:8040</value> 
</property> 
<property> 
    <name>yarn.resourcemanager.webapp.address</name> 
    <value>${yarn.resourcemanager.hostname}:8088</value> 
</property> 
<property> 
    <name>yarn.resourcemanager.admin.address</name> 
    <value>${yarn.resourcemanager.hostname}:8033</value> 
</property> 

Replace MASTER ADDRESS with the address of your master node. You can also change the addresses of the resource manager's web app, admin interface, and so on individually.


I am using Apache Hadoop version 2.7.2, so this may be an apples-to-oranges comparison, but I ran into the same silent hang a few days ago. In most cases, a long "silence" like this indicates that the scheduler cannot allocate enough resources to the application.

In my specific case, with a similar configuration, increasing the value of the yarn.nodemanager.resource.memory-mb property in yarn-site.xml did the trick.
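
As a rough sketch, on a machine with 8 GB of RAM that could look like the following; 6144 is an illustrative value chosen here to leave headroom for the OS and the Hadoop daemons, not a figure from the original answer:

<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>6144</value>
    <description>Physical memory, in MB, made available to running containers.</description>
</property>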

You can also look at the other resource-allocation properties here.


Your settings are incorrect.

You set yarn.nodemanager.resource.memory-mb to 2 GB. That is the amount of "physical memory, in MB, to be made available to running containers." But your mapreduce.map.memory.mb is 8 GB (8092 MB), and that is what each map task actually requests.

On top of that, you set yarn.app.mapreduce.am.resource.mb to 8 GB as well. So you are trying to allocate an 8 GB ApplicationMaster to control the job, plus several 8 GB mappers, on a node that only offers 2 GB to containers; nothing that large can ever be scheduled, which is why the job sits at 0%.

Solution

To fix this, you can drop the AM size to 1 GB and the mapper size to 0.5 GB; these are much more reasonable sizes to play around with, especially for word count.
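
A minimal sketch of what those sizes could look like in the configuration; the -Xmx heap values are my assumption, chosen only so that each JVM heap fits inside its container:

<property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>1024</value>
    <!-- 1 GB container for the ApplicationMaster -->
</property>
<property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx768m</value>
    <!-- assumed heap size, kept below the 1024 MB container -->
</property>
<property>
    <name>mapreduce.map.memory.mb</name>
    <value>512</value>
    <!-- 0.5 GB container per map task -->
</property>
<property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx384m</value>
    <!-- assumed heap size, kept below the 512 MB container -->
</property>

Note that with yarn.scheduler.minimum-allocation-mb still set to 2048, a 512 MB request would be rounded up to 2 GB, so that property likely needs to be lowered (for example to 512) as well.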

Additional resources

You can refer to this instruction provided by Cloudera to understand these properties in more detail.


Dear Grant, welcome to StackOverflow, and thank you for answering the question. Please provide an example with your answer. It would also be great if you referred to official sources and documentation; please elaborate further, because answers given here should be as comprehensive as possible. See [How to write a good answer](https://stackoverflow.com/help/how-to-answer) in the Help Center. Finally, please structure your answers/questions/comments properly (I will correct this one for now). – Pouria