2012-09-29 107 views

Mahout error on EMR: Java heap space

I ran a clustering job on EMR. The data set is large. Everything ran fine until:

2012-09-29 10:50:58,063 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 23% 
2012-09-29 10:51:31,157 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 24% 
2012-09-29 10:51:50,197 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 25% 
2012-09-29 10:52:17,236 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 26% 
2012-09-29 10:52:41,270 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 27% 
2012-09-29 10:53:08,350 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 28% 
2012-09-29 10:53:29,377 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 29% 
2012-09-29 10:53:54,411 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 30% 
2012-09-29 10:54:21,448 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 31% 
2012-09-29 10:54:48,486 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 32% 
Error: Java heap space 
attempt_201209271139_0004_r_000000_0: SLF4J: Class path contains multiple SLF4J bindings. 
attempt_201209271139_0004_r_000000_0: SLF4J: Found binding in [jar:file:/home/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
attempt_201209271139_0004_r_000000_0: SLF4J: Found binding in [jar:file:/mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201209271139_0004/jars/job.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
attempt_201209271139_0004_r_000000_0: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 

So the basic question is: how do I fix this?

Answer


I created m1.large instances instead of m1.small and ran the s3://elasticmapreduce/bootstrap-actions/configurations/latest/memory-intensive bootstrap action. That helped.

So I added two lines to my job flow creation command line:

> --master-instance-type m1.large --slave-instance-type m1.large \ 
> --bootstrap-action s3://elasticmapreduce/bootstrap-actions/configurations/latest/memory-intensive 

That's all.
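For context, a full invocation might look roughly like the sketch below. It assumes the Ruby `elastic-mapreduce` command-line client; the job flow name and instance count are made-up placeholders, and only the instance-type and bootstrap-action flags come from the lines above. The function only prints the command, so it can be reviewed before anything is launched.

```shell
# Print (do not run) a job flow creation command that uses larger
# instances and the memory-intensive bootstrap action.
# Assumption: the Ruby elastic-mapreduce client is the tool in use.
build_emr_command() {
  # $1: job flow name, $2: number of instances (hypothetical placeholders)
  printf '%s ' elastic-mapreduce --create --alive \
    --name "$1" --num-instances "$2" \
    --master-instance-type m1.large --slave-instance-type m1.large \
    --bootstrap-action s3://elasticmapreduce/bootstrap-actions/configurations/latest/memory-intensive
  printf '\n'
}

build_emr_command mahout-clustering 3
```

Once the printed line looks right, it can be pasted into the terminal, with the Mahout job step added in whatever way the job flow normally uses.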