Hadoop block size in a single-node cluster needs clarification

I have a single-node Hadoop cluster, version 2.x. The block size I have set is 64 MB, and I have an HDFS input file of size 84 MB. Now, when I run the MR job, I see that there are 2 splits, which is valid, since 84 MB / 64 MB ≈ 2, hence 2 splits.
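For reference, here is a minimal sketch of the split math as I understand `FileInputFormat` does it in Hadoop 2.x. The class name and hard-coded values below are mine, just for illustration; only the formula `splitSize = max(minSize, min(maxSize, blockSize))` and the 1.1 slop factor come from the Hadoop source:

```java
// Minimal sketch (not Hadoop's actual class) of how FileInputFormat in
// Hadoop 2.x sizes input splits.
public class SplitMath {
    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // dfs.blocksize as I configured it
        long minSize   = 1L;                // mapreduce.input.fileinputformat.split.minsize (default)
        long maxSize   = Long.MAX_VALUE;    // mapreduce.input.fileinputformat.split.maxsize (default)
        long fileSize  = 84L * 1024 * 1024; // my 84 MB input file

        long splitSize = Math.max(minSize, Math.min(maxSize, blockSize));

        // Hadoop keeps carving off splits while the remainder exceeds
        // splitSize * 1.1 (SPLIT_SLOP), then emits one final split.
        long splits = 0;
        long remaining = fileSize;
        while ((double) remaining / splitSize > 1.1) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining > 0) splits++;

        System.out.println("splitSize=" + splitSize + " B, splits=" + splits); // prints 2
    }
}
```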
But when I run the command `hadoop fsck -blocks` to see the details of the blocks, I see the following:
```
Total size: 90984182 B
Total dirs: 16
Total files: 7
Total symlinks: 0
Total blocks (validated): 7 (avg. block size 12997740 B)
Minimally replicated blocks: 7 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 1
Average block replication: 1.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 1
Number of racks: 1
```
As you can see, the average block size is close to 13 MB. Why is this? Ideally, the block size should be 64 MB, right?
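For what it's worth, I also tried inspecting the blocks of the input file itself through the HDFS client API, since the fsck totals above cover every file under the path scanned (7 files here), not just this one file. A minimal sketch, assuming a hypothetical path for my input file:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListBlocks {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical path; substitute the actual 84 MB input file.
        Path file = new Path("/user/hadoop/input/data.txt");
        FileStatus status = fs.getFileStatus(file);

        // One BlockLocation per block of this file only.
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        System.out.println(file + " has " + blocks.length + " block(s)");
        for (BlockLocation b : blocks) {
            System.out.println("  offset=" + b.getOffset() + " length=" + b.getLength());
        }
    }
}
```

Running fsck against the file path directly (e.g. `hadoop fsck /user/hadoop/input/data.txt -files -blocks`) should show the same per-file breakdown.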
[No. of files vs no. of blocks in HDFS](http://stackoverflow.com/questions/21275082/no-of-files-vs-no-of-blocks-in-hdfs) – emeth