distcp from S3 to Hadoop - file not found

2013-05-04

I get the following error saying a file was not found, but the file does exist. I'm new to distcp, and FYI I'm using Cloudera.

https://s3.amazonaws.com/test-development/test/201305031003_0_ubuntu.gz 


[email protected]:~$ hadoop distcp -i 201305031003_0_ubuntu.gz s3://id:[email protected]/test/201305031003_0_ubuntu.gz 
13/05/04 14:54:29 INFO tools.DistCp: srcPaths=[201305031003_0_ubuntu.gz] 
13/05/04 14:54:29 INFO tools.DistCp: destPath=s3://id:[email protected]/test/201305031003_0_ubuntu.gz 
With failures, global counters are inaccurate; consider running with -i 
Copy failed: org.apache.hadoop.mapred.InvalidInputException: Input source 201305031003_0_ubuntu.gz does not exist. 
    at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:641) 
    at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656) 
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:881) 
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) 
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) 
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:908) 

Answers

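The first argument is the source, so it should be the S3 path. The path should also use s3n:// (the native S3 file system) rather than s3://, unless you wrote the data to S3 using s3:// (the block file system).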
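
As a rough sketch of what that reordering might look like: the bucket and key below are read off the URL in the question, while the HDFS destination directory is purely a hypothetical placeholder. It also assumes the S3 credentials are already configured (for example via `fs.s3n.awsAccessKeyId` and `fs.s3n.awsSecretAccessKey`, or embedded in the URI as in the question). The script only prints the command so it can be checked first.

```shell
#!/bin/sh
# Assumption: bucket and key inferred from the path-style URL in the question.
BUCKET="test-development"
KEY="test/201305031003_0_ubuntu.gz"

# Assumption: hypothetical HDFS target directory, replace with your own.
DEST="hdfs:///user/ubuntu/"

# Source comes first and uses the native s3n:// scheme; destination comes second.
SRC="s3n://${BUCKET}/${KEY}"

# Dry run: print the command for review. Remove "echo" to actually run distcp.
echo hadoop distcp "$SRC" "$DEST"
```

If the object was originally written through the s3:// block file system, the scheme in `SRC` would stay `s3://` instead.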

Do you mean that if the data was written with "s3://", you can only retrieve it with "s3://", and the same goes for "s3n://"? – soulmachine 2014-12-18 22:23:44