Sqoop export Oozie workflow fails with "file does not exist", works when run from console

Asked 2015-10-01

I have a Hadoop cluster with 6 nodes. I pull data out of MSSQL and then load it back into MSSQL via Sqoop. The Sqoop import command works fine, and I can run the sqoop export command from a console (on one of the Hadoop nodes). Here is the shell script I run:

SQLHOST=sqlservermaster.local 
SQLDBNAME=db1 
HIVEDBNAME=db1 
BATCHID= 
USERNAME="sqlusername" 
PASSWORD="password" 


# Export the Hive table back to SQL Server
sqoop export --connect "jdbc:sqlserver://$SQLHOST;username=$USERNAME;password=$PASSWORD;database=$SQLDBNAME" --table ExportFromHive --columns col1,col2,col3 --export-dir /apps/hive/warehouse/$HIVEDBNAME.db/hivetablename
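For reference, here is a minimal sketch of the corresponding Oozie sqoop action; the workflow name, the ${jobTracker}/${nameNode} and ${sql*}/${hiveDbName} parameters, and the schema versions are illustrative assumptions, not values from the original post:

<workflow-app name="sqoop-export-wf" xmlns="uri:oozie:workflow:0.4">
    <start to="sqoop-export"/>
    <action name="sqoop-export">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- Same export as the shell script, parameterized for Oozie -->
            <command>export --connect jdbc:sqlserver://${sqlHost};username=${sqlUsername};password=${sqlPassword};database=${sqlDbName} --table ExportFromHive --columns col1,col2,col3 --export-dir /apps/hive/warehouse/${hiveDbName}.db/hivetablename</command>
        </sqoop>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <fail name="fail">
        <message>Sqoop export failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </fail>
    <end name="end"/>
</workflow-app>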

When I run this command from the Oozie workflow, passing the same parameters, I get the following error (found by digging into the actual job run logs from the YARN scheduler screen):

2015-10-01 20:55:31,084 WARN [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Job init failed 
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://hadoopnode1:8020/user/root/.staging/job_1443713197941_0134/job.splitmetainfo 
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1568) 
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1432) 
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1390) 
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) 
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) 
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) 
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) 
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996) 
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138) 
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1312) 
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1080) 
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) 
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1519) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) 
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1515) 
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1448) 
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://hadoopnode1:8020/user/root/.staging/job_1443713197941_0134/job.splitmetainfo 
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309) 
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301) 
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) 
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301) 
    at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:51) 
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1563) 
    ... 17 more 

Has anyone seen this and managed to troubleshoot it? It only happens in the Oozie workflow. There are similar threads out there, but none of them seem to address this specific problem.

Thanks!

Answer

I was able to solve this by setting the user.name property for the Oozie workflow to the user yarn in the job.properties file:

user.name=yarn 
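In context, a minimal job.properties might look like the following sketch; the nameNode address matches the one in the error log, but the jobTracker port and application path are assumptions, not values from the original post:

# Cluster endpoints (jobTracker port assumed; nameNode taken from the error log)
nameNode=hdfs://hadoopnode1:8020
jobTracker=hadoopnode1:8050

# Run the workflow as yarn so staging files land under /user/yarn
user.name=yarn

# Hypothetical application path for this workflow
oozie.wf.application.path=${nameNode}/user/yarn/workflows/sqoop-export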

I think the problem was that the job didn't have permission to create the staging files under /user/root. Once I changed the running user to yarn, the staging files were created under /user/yarn, which had the proper permissions.
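If switching the running user isn't an option, an alternative is to give the original user a properly owned HDFS home directory instead. This is a hedged sketch, assuming the hdfs CLI and an hdfs superuser account are available, and that root:hdfs is an appropriate owner/group on this cluster:

# Check who owns the per-user directories that hold the .staging area
hdfs dfs -ls /user

# Create and chown a home directory for root so its staging files can be written
sudo -u hdfs hdfs dfs -mkdir -p /user/root
sudo -u hdfs hdfs dfs -chown root:hdfs /user/root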

I have exactly the same problem, but this doesn't work; it still creates the job directory under root's .staging directory. I'm running the Oozie WF job as 'root'. Should I be doing this as 'yarn' instead? – jastang

In my case, I started the job as root but set user.name in the job.properties file to yarn; the staging directory was then created under yarn, and that resolved the issue I was seeing. –

Hmm, this works for me if I run the job from the Oozie CLI as 'yarn', as it kept creating the .staging directory under /user/root/ otherwise. Thanks anyway! – jastang