Asked 2016-03-07 · viewed 39 times · score 2

Why do my Azure Data Factory jobs fail in Hadoop/MapReduce? Some of my ADF jobs fail at random, and the output points to the data below from the /PackageJobs/~job/Status/stderr file.

Note that this does not happen every time; it occurs at random on some jobs, while other jobs complete normally.

What could be causing this problem?

The stderr output is as follows:

log4j:ERROR Could not instantiate class  [com.microsoft.log4jappender.FilterLogAppender]. 
java.lang.ClassNotFoundException: com.microsoft.log4jappender.FilterLogAppender 
at java.net.URLClassLoader$1.run(URLClassLoader.java:366) 
at java.net.URLClassLoader$1.run(URLClassLoader.java:355) 
at java.security.AccessController.doPrivileged(Native Method) 
at java.net.URLClassLoader.findClass(URLClassLoader.java:354) 
at java.lang.ClassLoader.loadClass(ClassLoader.java:425) 
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) 
at java.lang.ClassLoader.loadClass(ClassLoader.java:358) 
at java.lang.Class.forName0(Native Method) 
at java.lang.Class.forName(Class.java:190) 
at org.apache.log4j.helpers.Loader.loadClass(Loader.java:198) 
at org.apache.log4j.helpers.OptionConverter.instantiateByClassName(OptionConverter.java:327) 
at org.apache.log4j.helpers.OptionConverter.instantiateByKey(OptionConverter.java:124) 
at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:785) 
at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768) 
at org.apache.log4j.PropertyConfigurator.parseCatsAndRenderers(PropertyConfigurator.java:672) 
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:516) 
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580) 
at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526) 
at org.apache.log4j.LogManager.<clinit>(LogManager.java:127) 
at org.apache.log4j.Logger.getLogger(Logger.java:104) 
at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:262) 
at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:108) 
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) 
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
at java.lang.reflect.Constructor.newInstance(Constructor.java:526) 
at org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1025) 
at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:844) 
at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:541) 
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:292) 
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:269) 
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:657) 
at org.apache.hadoop.util.ShutdownHookManager.<clinit>(ShutdownHookManager.java:44) 
at org.apache.hadoop.util.RunJar.run(RunJar.java:200) 
at org.apache.hadoop.util.RunJar.main(RunJar.java:136) 
log4j:ERROR Could not instantiate appender named "RMSUMFilterLog". 
16/03/04 10:56:02 INFO impl.TimelineClientImpl: Timeline service address: http://headnodehost:8188/ws/v1/timeline/ 
16/03/04 10:56:02 INFO client.RMProxy: Connecting to ResourceManager at headnodehost/100.74.24.3:9010 
16/03/04 10:56:02 INFO client.AHSProxy: Connecting to Application History server at headnodehost/100.74.24.3:10200 
16/03/04 10:56:03 INFO impl.TimelineClientImpl: Timeline service address: http://headnodehost:8188/ws/v1/timeline/ 
16/03/04 10:56:03 INFO client.RMProxy: Connecting to ResourceManager at headnodehost/100.74.24.3:9010 
16/03/04 10:56:03 INFO client.AHSProxy: Connecting to Application History server at headnodehost/100.74.24.3:10200 
16/03/04 10:56:06 INFO mapred.FileInputFormat: Total input paths to process : 1 
16/03/04 10:56:06 INFO mapreduce.JobSubmitter: number of splits:1 
16/03/04 10:56:06 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 
16/03/04 10:56:06 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 
16/03/04 10:56:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1457068773628_0022 
16/03/04 10:56:07 INFO mapreduce.JobSubmitter: Kind: mapreduce.job, Service: job_1457068773628_0019, Ident: ([email protected]5019bc) 
16/03/04 10:56:08 INFO impl.YarnClientImpl: Submitted application application_1457068773628_0022 
16/03/04 10:56:08 INFO mapreduce.Job: The url to track the job: http://headnodehost:9014/proxy/application_1457068773628_0022/ 
16/03/04 10:56:08 INFO mapreduce.Job: Running job: job_1457068773628_0022 
16/03/04 10:56:18 INFO mapreduce.Job: Job job_1457068773628_0022 running in uber mode : false 
16/03/04 10:56:18 INFO mapreduce.Job: map 0% reduce 0% 
16/03/04 10:56:31 INFO mapreduce.Job: map 100% reduce 0% 
16/03/04 23:48:59 INFO mapreduce.Job: Task Id : attempt_1457068773628_0022_m_000000_0, Status : FAILED 
AttemptID:attempt_1457068773628_0022_m_000000_0 Timed out after 600 secs 
16/03/04 23:49:00 INFO mapreduce.Job: map 0% reduce 0% 
16/03/04 23:49:16 INFO mapreduce.Job: map 100% reduce 0% 
16/03/05 00:01:00 INFO mapreduce.Job: Task Id : attempt_1457068773628_0022_m_000000_1, Status : FAILED 
AttemptID:attempt_1457068773628_0022_m_000000_1 Timed out after 600 secs 
16/03/05 00:01:01 INFO mapreduce.Job: map 0% reduce 0% 
16/03/05 00:01:21 INFO mapreduce.Job: map 100% reduce 0% 
16/03/05 00:13:00 INFO mapreduce.Job: Task Id : attempt_1457068773628_0022_m_000000_2, Status : FAILED 
AttemptID:attempt_1457068773628_0022_m_000000_2 Timed out after 600 secs 
16/03/05 00:13:01 INFO mapreduce.Job: map 0% reduce 0% 
16/03/05 00:13:18 INFO mapreduce.Job: map 100% reduce 0% 
16/03/05 00:25:03 INFO mapreduce.Job: Job job_1457068773628_0022 failed with state FAILED due to: Task failed task_1457068773628_0022_m_000000 
Job failed as tasks failed. failedMaps:1 failedReduces:0 

16/03/05 00:25:03 INFO mapreduce.Job: Counters: 9 
Job Counters 
    Failed map tasks=4 
    Launched map tasks=4 
    Other local map tasks=3 
    Rack-local map tasks=1 
    Total time spent by all maps in occupied slots (ms)=48514665 
    Total time spent by all reduces in occupied slots (ms)=0 
    Total time spent by all map tasks (ms)=48514665 
    Total vcore-seconds taken by all map tasks=48514665 
    Total megabyte-seconds taken by all map tasks=74518525440 
16/03/05 00:25:03 ERROR streaming.StreamJob: Job not successful! 
Streaming Command Failed! 

Answer (0 votes)

This looks like a known timeout issue with Hadoop/HDInsight. If an activity writes nothing to the console for 10 minutes, the task gets killed. Could you modify your code to write a ping to the console every 9 minutes and see whether that resolves it?
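Since "Streaming Command Failed!" in the log suggests a Hadoop streaming job, the suggested ping could be implemented as a background heartbeat thread in the mapper script. This is a minimal sketch, not the asker's actual code: it assumes a Python streaming mapper and relies on Hadoop streaming's convention that `reporter:status:` lines written to stderr count as task progress, resetting the 600-second task timeout seen in the log (`Timed out after 600 secs`):

```python
import sys
import threading
import time

def start_heartbeat(interval_secs=9 * 60):
    """Start a daemon thread that writes a status line to stderr
    every interval_secs, so a long silent computation does not
    trip Hadoop's task timeout (600s in the failing job above)."""
    def beat():
        while True:
            time.sleep(interval_secs)
            # Hadoop streaming interprets 'reporter:status:' lines
            # on stderr as progress reports from the task.
            sys.stderr.write("reporter:status:still working\n")
            sys.stderr.flush()
    t = threading.Thread(target=beat, daemon=True)
    t.start()
    return t

if __name__ == "__main__":
    start_heartbeat()
    # ... the mapper's real (potentially long-running) work would
    # read from sys.stdin and write results to sys.stdout here ...
```

The thread is a daemon so it dies with the mapper process and never blocks job completion; 9 minutes keeps it safely under the 10-minute limit.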

Comment: I will try that and report back, thanks.

Comment: I have not run into this issue again, but I never made any changes to the activity, so it may have been an internal problem.