hadoop map reduce job pending too long

I have a question about running a Hadoop MapReduce job. I have a table staff, partitioned by join date. The create statement looks like this:
create table staff(id int, age int) partitioned by (join_date string) row format delimited fields terminated by '\;';
I loaded some data into partition '20130921', and when I execute the statement below, the result is fine:
select count(*) from staff where join_date='20130921';
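But when I run the same query on partition '20130922' (this partition has no data), the MapReduce job stays pending for too long and looks like it will run forever: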
hive> select count(*) from staff where join_date='20130922';
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Starting Job = job_201309231116_0131, Tracking URL = ....jobid=job_201309231116_0131
Kill Command = /u01/hadoop-0.20.203.0/bin/../bin/hadoop job -kill job_201309231116_0131
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1
2013-09-23 17:19:07,182 Stage-1 map = 0%, reduce = 0%
2013-09-23 17:19:07,182 Stage-1 map = 0%, reduce = 0%
2013-09-23 17:19:07,182 Stage-1 map = 0%, reduce = 0%
The JobTracker shows the reduce task as pending, and the job never seems to finish.
I am using hadoop-0.20.203.0 and hive-0.10.0. I searched Google all day but did not find any topic describing the same problem. Please help me.
Best regards.
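Did you find anything interesting in the TaskTracker logs? – Tariq
I went through the JobTracker, TaskTracker, and job logs but did not find any warnings or errors. I also tested the 'select count(*)' statement on a table that does not use partitions, and the result is the same: the MapReduce job cannot finish. I tried setting the property 'mapreduce.task.timeout', but Hadoop does not kill the job. – user2806318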
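A note on the timeout property mentioned in the last comment, offered as a minimal sketch and not a confirmed fix. In the Hadoop 0.20.x line the task timeout is configured as `mapred.task.timeout` (in milliseconds); `mapreduce.task.timeout` is the name used by later releases, so the setting may not have taken effect under that name on hadoop-0.20.203.0. In either case the timeout only applies to a running task attempt that stops reporting progress, so a reduce task that is still pending has no attempt to kill. The value 600000 below is only the usual default, and `SHOW PARTITIONS` is included just to check whether partition '20130922' is registered in the metastore at all.

```sql
-- Run inside the Hive CLI.
-- Assumption: Hadoop 0.20.x property names apply to this cluster.

-- List the partitions Hive has registered for the table.
SHOW PARTITIONS staff;

-- Task timeout in milliseconds (600000 = 10 minutes, the usual default).
-- It only affects running task attempts that stop reporting progress.
SET mapred.task.timeout=600000;

-- Print the value currently in effect for this session.
SET mapred.task.timeout;

-- Re-run the query from the question against the empty partition.
SELECT count(*) FROM staff WHERE join_date='20130922';
```

If the reduce task still shows as pending after this, the timeout is probably not the relevant knob, since no task attempt has been started for it to act on.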