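How to import/load a .csv file in Pig?

Suppose I have a tab-delimited text file (datetemp.txt). I want to load this text file into Pig for processing, but when I type the following lines, I get an error: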
grunt> inputfile = load '/training/pig/datetemp.txt' using PigStorage() As (EventID:chararray,eventdate:chararray,count:int);
grunt> dump inputfile;
2014-09-06 08:41:23,527 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2014-09-06 08:41:23,544 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-09-06 08:41:23,548 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-09-06 08:41:23,548 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-09-06 08:41:23,551 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-09-06 08:41:23,551 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-09-06 08:41:23,552 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - creating jar file Job2739171785773930333.jar
2014-09-06 08:42:39,608 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - jar file Job2739171785773930333.jar created
2014-09-06 08:42:39,612 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-09-06 08:42:39,619 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2014-09-06 08:42:39,630 WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2014-09-06 08:42:39,891 [Thread-12] INFO org.apache.hadoop.mapred.JobClient - Cleaning up the staging area hdfs://192.168.195.130:8020/var/lib/hadoop-hdfs/cache/mapred/mapred/staging/training/.staging/job_201408292336_0009
2014-09-06 08:42:39,891 [Thread-12] ERROR org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:training (auth:SIMPLE) cause:org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://192.168.195.130:8020/training/pig/datetemp.txt
2014-09-06 08:42:40,119 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2014-09-06 08:42:40,125 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job null has failed! Stop running all dependent jobs
2014-09-06 08:42:40,125 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2014-09-06 08:42:40,131 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://192.168.195.130:8020/training/pig/datetemp.txt
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:285)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1014)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1031)
    at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:172)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:943)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:896)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:896)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:531)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:318)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.startReadyJobs(JobControl.java:238)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:269)
    at java.lang.Thread.run(Thread.java:662)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://192.168.195.130:8020/training/pig/datetemp.txt
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:231)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:248)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:273)
    ... 15 more
2014-09-06 08:42:40,131 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2014-09-06 08:42:40,135 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.0.0-cdh4.1.1 0.10.0-cdh4.1.1 training 2014-09-06 08:41:23 2014-09-06 08:42:40 UNKNOWN

Failed!

Failed Jobs:
JobId Alias Feature Message Outputs
N/A inputfile MAP_ONLY Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://192.168.195.130:8020/training/pig/datetemp.txt
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:285)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1014)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1031)
    at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:172)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:943)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:896)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:896)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:531)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:318)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.startReadyJobs(JobControl.java:238)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:269)
    at java.lang.Thread.run(Thread.java:662)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://192.168.195.130:8020/training/pig/datetemp.txt
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:231)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:248)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:273)
    ... 15 more
hdfs://192.168.195.130:8020/tmp/temp-1004538676/tmp1582688785,
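Input(s):
Failed to read data from "/training/pig/datetemp.txt"

Output(s):
Failed to produce result in "hdfs://192.168.195.130:8020/tmp/temp-1004538676/tmp1582688785"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
null

2014-09-06 08:42:40,135 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2014-09-06 08:42:40,142 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias inputfile
Details at logfile: /home/training/pig_1410006833865.log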
Please help me with this!
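
The log reports ERROR 2118: the input path does not exist at hdfs://192.168.195.130:8020/training/pig/datetemp.txt, so one thing worth checking is whether datetemp.txt was ever copied into HDFS. Below is a minimal sketch from the grunt shell, not a confirmed fix. It assumes the file currently sits on the local filesystem at /home/training/datetemp.txt (a hypothetical path, not stated in the question) and that the HDFS directory /training/pig already exists.

```pig
-- List the HDFS directory that the LOAD statement points at.
fs -ls /training/pig

-- Copy the file into HDFS if it is missing.
-- The local source path below is an assumption for illustration only.
fs -copyFromLocal /home/training/datetemp.txt /training/pig/datetemp.txt

-- PigStorage() splits on tabs by default; '\t' just makes that explicit.
-- For a comma-separated file, PigStorage(',') would be used instead.
inputfile = LOAD '/training/pig/datetemp.txt' USING PigStorage('\t')
            AS (EventID:chararray, eventdate:chararray, count:int);
DUMP inputfile;
```

If the file is meant to stay on the local disk, starting Pig in local mode with `pig -x local` makes LOAD read from the local filesystem rather than HDFS.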
For people who found this post when looking for [ERROR 1066: Unable to open iterator for alias](http://stackoverflow.com/questions/34495085/error-1066-unable-to-open-iterator-for-alias-in-pig-generic-solution), here is a [generic solution](http://stackoverflow.com/a/34495086/983722). – 2015-12-28 15:06:39