2014-02-28 43 views
1

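Problem using Sqoop to export doubles and floats from HDFS to MySQL

I am using Hadoop version 1.2.1 and Sqoop 1.4.4.

I'm new to Hadoop/Sqoop and have run into a problem. I have data in HDFS that I want to export to MySQL, but the export keeps failing. The statement I am using is:

sqoop export --connect jdbc:mysql://{ip address}/{database} --username {username} -P --table {tablename} --export-dir {export dir} --input-fields-terminated-by ',' --lines-terminated-by '\n' --verbose

The error I get is: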

14/02/28 10:12:40 INFO mapred.JobClient: Running job: job_201402040959_0234 
14/02/28 10:12:41 INFO mapred.JobClient: map 0% reduce 0% 
14/02/28 10:12:51 INFO mapred.JobClient: map 50% reduce 0% 
14/02/28 10:22:51 INFO mapred.JobClient: map 0% reduce 0% 
14/02/28 10:22:52 INFO mapred.JobClient: Task Id : attempt_201402040959_0234_m_000000_0, Status : FAILED 
Task attempt_201402040959_0234_m_000000_0 failed to report status for 600 seconds. Killing! 
14/02/28 10:22:52 INFO mapred.JobClient: Task Id : attempt_201402040959_0234_m_000001_0, Status : FAILED 
Task attempt_201402040959_0234_m_000001_0 failed to report status for 600 seconds. Killing! 
14/02/28 10:23:00 INFO mapred.JobClient: map 50% reduce 0% 
14/02/28 10:33:00 INFO mapred.JobClient: map 0% reduce 0% 
14/02/28 10:33:00 INFO mapred.JobClient: Task Id : attempt_201402040959_0234_m_000000_1, Status : FAILED 
Task attempt_201402040959_0234_m_000000_1 failed to report status for 600 seconds. Killing! 
14/02/28 10:33:00 INFO mapred.JobClient: Task Id : attempt_201402040959_0234_m_000001_1, Status : FAILED 
Task attempt_201402040959_0234_m_000001_1 failed to report status for 600 seconds. Killing! 
14/02/28 10:33:09 INFO mapred.JobClient: map 50% reduce 0% 
14/02/28 10:43:09 INFO mapred.JobClient: map 0% reduce 0% 
14/02/28 10:43:09 INFO mapred.JobClient: Task Id : attempt_201402040959_0234_m_000000_2, Status : FAILED 
Task attempt_201402040959_0234_m_000000_2 failed to report status for 600 seconds. Killing! 
14/02/28 10:43:10 INFO mapred.JobClient: Task Id : attempt_201402040959_0234_m_000001_2, Status : FAILED 
Task attempt_201402040959_0234_m_000001_2 failed to report status for 600 seconds. Killing! 
14/02/28 10:43:18 INFO mapred.JobClient: map 50% reduce 0% 
14/02/28 10:53:18 INFO mapred.JobClient: map 25% reduce 0% 
14/02/28 10:53:19 INFO mapred.JobClient: map 0% reduce 0% 
14/02/28 10:53:20 INFO mapred.JobClient: Job complete: job_201402040959_0234 
14/02/28 10:53:20 INFO mapred.JobClient: Counters: 7 
14/02/28 10:53:20 INFO mapred.JobClient: Job Counters 
14/02/28 10:53:20 INFO mapred.JobClient:  SLOTS_MILLIS_MAPS=11987 
14/02/28 10:53:20 INFO mapred.JobClient:  Total time spent by all reduces waiting after reserving slots (ms)=0 
14/02/28 10:53:20 INFO mapred.JobClient:  Total time spent by all maps waiting after reserving slots (ms)=0 
14/02/28 10:53:20 INFO mapred.JobClient:  Launched map tasks=8 
14/02/28 10:53:20 INFO mapred.JobClient:  Data-local map tasks=8 
14/02/28 10:53:20 INFO mapred.JobClient:  SLOTS_MILLIS_REDUCES=0 
14/02/28 10:53:20 INFO mapred.JobClient:  Failed map tasks=1 
14/02/28 10:53:20 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 2,441.242 seconds (0 bytes/sec) 
14/02/28 10:53:20 INFO mapreduce.ExportJobBase: Exported 0 records. 
14/02/28 10:53:20 ERROR tool.ExportTool: Error during export: Export job failed! 

An example of the data is:

201110,1.8181818181818181 
201111,1.4597701149425288 
201112,1.766990291262136 
20119,1.6153846153846154 
20121,1.5857142857142856 
201210,1.55 
201211,1.5294117647058822 
201212,1.6528925619834711 
20122,1.5789473684210527 
20123,1.4848484848484849 
20124,1.654320987654321 
20125,1.5942028985507246 
20126,1.5333333333333334 
20127,1.4736842105263157 
20128,1.4666666666666666 
20129,1.4794520547945205 
20131,1.6875 
201310,8.233183856502242 
201311,8.524886877828054 
201312,9.333333333333334 
20132,1.7272727272727273 
20133,3.42 
20134,6.380597014925373 
20135,9.504716981132075 
20136,8.538812785388128 
20137,8.609649122807017 
20138,8.777272727272727 
20139,8.506787330316742 
20141,4.741784037558685 

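I tried exporting a similar data set with the same export statement, using only integers instead of doubles, and it succeeded. I also tried a similar data set with floats instead of doubles, but that failed as well. Could someone give me a hint as to why this isn't working? Am I doing something wrong with data types that aren't suitable for MySQL?

I also tried running the same command with the following option added: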
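Before suspecting the data types, a quick sanity check of the input file can rule out malformed rows. The following is a minimal sketch, assuming the two-column, comma-separated layout shown in the sample above; `check_rows` is a hypothetical helper name, not a Sqoop or Hadoop command, and it is meant to be run on a local copy of the exported file.

```shell
# Hypothetical helper (not part of Sqoop): report rows that do not have
# exactly two comma-separated fields with a plain decimal number in the second.
# Assumption: the file matches the sample layout, e.g. "201110,1.8181818181818181".
check_rows() {
  awk -F',' '
    NF != 2 || $2 !~ /^[ ]*-?[0-9]+(\.[0-9]+)?[ ]*$/ {
      printf "line %d: %s\n", NR, $0   # print the offending row
      bad = 1
    }
    END { exit bad + 0 }               # non-zero exit status if any row failed
  ' "$1"
}
```

If the function prints nothing and returns 0, every row has the expected shape, which points away from the file contents and toward the table definition or the job itself.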

-m 1

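This gives the same error as above, except that the map step gets to 100% instead of only 50%.

Thanks, and please let me know if I should provide any additional information.

Answers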

0

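Please update the question with the Hadoop, Sqoop, and MySQL versions so that the problem can be reproduced.

I'm going to assume you are using Hadoop 0.21.0. If that is the case, the failure may be caused by the org.apache.sqoop.mapreduce.ProgressThread class, which uses TaskInputOutputContext; that context does not correctly report progress to the underlying reporter, as described in [MAPREDUCE-1905].

If you are using 0.21.0, you will need to move to 0.21.1 or another Hadoop version.

Otherwise, I would suspect some other problem in ProgressThread or in how Sqoop reports progress. If changing versions doesn't help, there may be more information in the YARN or MR1 logs.

Default YARN log folder (set in etc/hadoop/yarn-env.sh):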

cd $HADOOP_YARN_HOME/logs 

Default MR1 log folder (set in etc/hadoop/mapred-env.sh):

cd $HADOOP_MAPRED_HOME/logs 
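The log folder holds many files, so it can help to narrow the search to the task attempt that was killed. The sketch below assumes the usual MR1 layout, where each attempt writes its own files under userlogs/{job id}/{attempt id}/ on the TaskTracker node that ran it. `find_attempt_logs` is a hypothetical helper name, not a Hadoop command, and the layout may differ on your installation.

```shell
# Hypothetical helper: list the syslog and stderr files of one task attempt.
# Assumption: logs are stored as <log root>/userlogs/<job id>/<attempt id>/<file>.
find_attempt_logs() {
  log_root="$1"     # e.g. "$HADOOP_MAPRED_HOME/logs"
  attempt_id="$2"   # e.g. attempt_201402040959_0234_m_000000_0
  find "$log_root" -type f -path "*/${attempt_id}/*" \
    \( -name syslog -o -name stderr \) | sort
}
```

For example, `find_attempt_logs "$HADOOP_MAPRED_HOME/logs" attempt_201402040959_0234_m_000000_0` lists the files for the first failed attempt in the question; searching those files for "Exception" is usually quicker than reading every log in the folder.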
+0

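I looked through the logs in $HADOOP_MAPRED_HOME/logs for a long time, but I still can't see where my error is coming from. There are a bunch of different log files, and I couldn't find anything that explains why the error is thrown. I tried comparing a successful integer export with a failed float export and didn't notice any difference in the logs. Do you have any suggestions? Thanks again, and sorry if my question is a bit vague. I'm a little confused, but I can try to clarify if you tell me what is unclear. – Murium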

0

The error was caused by an underscore in a column name. Apparently you can't have underscores in the column names.