Input file size: 75 GB. My YARN map-reduce job is taking a very long time.
Number of mappers: 2273
Number of reducers: 1 (as shown in the web UI)
Number of splits: 2273
Number of input files: 867
Cluster: Apache Hadoop 2.4.0
5-node cluster, 1 TB each.
1 master node and 4 datanodes.
4 hours have already passed and still only 12% of the maps are complete. I just want to know whether my cluster configuration makes sense, or whether something is wrong with the configuration.
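As a quick sanity check of the counters above (my own arithmetic, assuming splits map one-to-one to map tasks, as the matching counts suggest):

```python
# Back-of-envelope split arithmetic from the job counters above.
input_gb = 75      # total input size
num_splits = 2273  # splits reported by the job
num_files = 867    # number of input files

avg_split_mb = input_gb * 1024 / num_splits
splits_per_file = num_splits / num_files

print(round(avg_split_mb, 1))    # average split size in MB
print(round(splits_per_file, 1)) # average splits per input file
```

The average split works out to roughly 34 MB, well below the 128 MB block size commonly used on Hadoop 2.x, so the job is launching far more (and smaller) map tasks than the input volume alone would require.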
yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8040</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
<description>The hostname of the RM.</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
<description>Minimum limit of memory to allocate to each container request at the Resource Manager.</description>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>8192</value>
<description>Maximum limit of memory to allocate to each container request at the Resource Manager.</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
<description>The minimum allocation for every container request at the RM, in terms of virtual CPU cores. Requests lower than this won't take effect, and the specified value will get allocated the minimum.</description>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>32</value>
<description>The maximum allocation for every container request at the RM, in terms of virtual CPU cores. Requests higher than this won't take effect, and will get capped to this value.</description>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>8192</value>
<description>Physical memory, in MB, to be made available to running containers</description>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>4</value>
<description>Number of CPU cores that can be allocated for containers.</description>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>4</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
<description>Whether virtual memory limits will be enforced for containers</description>
</property>
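The memory settings above put an upper bound on how many containers each NodeManager can host. A rough back-of-envelope sketch (my own arithmetic, assuming every map task requests the 1024 MB scheduler minimum; if mapreduce.map.memory.mb is set higher, the per-node count drops accordingly):

```python
# Container-capacity arithmetic from the yarn-site.xml values above.
nodemanager_memory_mb = 8192  # yarn.nodemanager.resource.memory-mb
min_allocation_mb = 1024      # yarn.scheduler.minimum-allocation-mb
datanodes = 4                 # worker nodes in this cluster
num_mappers = 2273            # map tasks reported by the job

containers_per_node = nodemanager_memory_mb // min_allocation_mb
cluster_containers = containers_per_node * datanodes
map_waves = -(-num_mappers // cluster_containers)  # ceiling division

print(containers_per_node)  # containers one NodeManager can host
print(cluster_containers)   # cluster-wide container capacity
print(map_waves)            # full "waves" needed to run all mappers
```

Under these assumptions the cluster could run up to 32 one-GB containers at once, so 2273 mappers would take about 72 waves even at full utilization; fewer running containers stretches that further.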
It is a map-reduce job in which I use MultipleOutputs, so the reducer emits multiple files. Each machine has 15 GB of RAM. 8 containers are running, and the total available memory shown in the RM web UI is 32 GB.
Any guidance is appreciated. Thanks in advance.
Could you provide information about the kind of job you are running? Also, how much RAM does each machine have? Could you log in to the ResourceManager UI and check the total memory available to the cluster and the number of containers running in parallel? I am wondering whether the job is actually utilizing those resources. –
@shivanand pawar: It is a map-reduce job in which I am using MultipleOutputs, so I will have multiple output files. Each machine has 15 GB of RAM. 8 containers are running. The total available memory is 32 GB. – Shash