2014-05-05

I am trying to run map-reduce operations through HiveQL. Plain SELECT queries work, but filter and aggregation operations throw the exception below; please help me resolve it. I have already added the mongo-hadoop jars to the appropriate path for the MongoDB-to-Hive table mapping.

hive> select * from users;
1 Tom 28
2 Alice 18
3 Bob 29

hive> select * from users where age >= 20;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator

Kill Command = /home/administrator/hadoop-2.2.0//bin/hadoop job -kill job_1398687508122_0002 
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0 
2014-05-05 12:08:41,195 Stage-1 map = 0%, reduce = 0% 
2014-05-05 12:08:57,723 Stage-1 map = 100%, reduce = 0%
Ended Job = job_1398687508122_0002 with errors 
Error during job, obtaining debugging information... 
Examining task ID: task_1398687508122_0002_m_000000 (and more) from job job_1398687508122_0002 

Task with the most failures(4): 
----- 
Task ID: 
    task_1398687508122_0002_m_000000 
----- 
Diagnostic Messages for this Task: 
Error: java.io.IOException: java.io.IOException: Couldn't get next key/value from mongodb: 
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) 
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) 
    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:276) 
    at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79) 
    at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33) 
    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108) 
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197) 
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157) 
Caused by: java.io.IOException: Couldn't get next key/value from mongodb: 
    at com.mongodb.hadoop.mapred.input.MongoRecordReader.nextKeyValue(MongoRecordReader.java:93) 
    at com.mongodb.hadoop.mapred.input.MongoRecordReader.next(MongoRecordReader.java:98) 
    at com.mongodb.hadoop.mapred.input.MongoRecordReader.next(MongoRecordReader.java:27) 
    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274) 
    ... 13 more 
Caused by: com.mongodb.MongoException$Network: Read operation to server localhost/127.0.0.1:12345 failed on database test 
    at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:253) 
    at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:216) 
    at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:288) 
    at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:273) 
    at com.mongodb.DBCursor._check(DBCursor.java:368) 
    at com.mongodb.DBCursor._hasNext(DBCursor.java:459) 
    at com.mongodb.DBCursor.hasNext(DBCursor.java:484) 
    at com.mongodb.hadoop.mapred.input.MongoRecordReader.nextKeyValue(MongoRecordReader.java:80) 
    ... 16 more 
Caused by: java.net.ConnectException: Connection refused 
    at java.net.PlainSocketImpl.socketConnect(Native Method) 
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) 
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) 
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) 
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) 
    at java.net.Socket.connect(Socket.java:579) 
    at com.mongodb.DBPort._open(DBPort.java:223) 
    at com.mongodb.DBPort.go(DBPort.java:125) 
    at com.mongodb.DBPort.call(DBPort.java:92) 
    at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:244) 
    ... 23 more 
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask 
MapReduce Jobs Launched: 
Job 0: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL 
Total MapReduce CPU Time Spent: 0 msec 

Answer
In Hive, `select * from table` runs in a different mode than any more complex query. That query runs inside the Hive client itself, in a single JVM. The logic is that the query ultimately has to print everything to the console from a single thread anyway, so doing all the work from that thread is no worse. Everything else, including a simple filter, runs as one or more MapReduce jobs.
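This shortcut is governed by a Hive setting (a sketch; available from roughly Hive 0.10 onward, and the default and accepted values vary by version):

```sql
-- Controls when Hive serves a query directly from the client process
-- instead of compiling it to MapReduce:
--   minimal : only SELECT *, partition filters, and LIMIT run in-process
--   more    : simple SELECTs with filters/projections also run in-process
SET hive.fetch.task.conversion=minimal;
```

With `minimal`, the filtered query in the question is always compiled to a MapReduce job, which is why it fails on a task node while `select *` succeeds.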

When you ran the query without a filter, I am guessing you did so on the same machine where MongoDB is running, so it could connect to localhost:12345. But when you run a MapReduce job, a different machine tries to connect: a task node. The mapper tries to connect to "localhost:12345" to fetch data from Mongo and cannot. Perhaps Mongo is not running on that machine, or it is running on a different port; I do not know how your cluster is configured.
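One quick way to verify this diagnosis is to test TCP reachability from a task node itself. A minimal sketch (the host and port are placeholders taken from the stack trace; substitute your own):

```python
import socket

def can_reach(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Run this from a *task node*, not from the Hive client machine:
    the mapper is the process that must be able to reach mongod.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Port 12345 comes from the stack trace; note it is non-standard
# (mongod listens on 27017 by default).
print(can_reach("localhost", 12345))
```

If this prints `False` on the task nodes but `True` on the machine hosting Mongo, the problem is exactly the one described above.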

In any case, you should specify the location of the MongoDB instance in a way that makes it reachable from every machine in the cluster. A reasonably static local IP address would work, but it is better to use a hostname resolved through DNS.
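Concretely, that means the `mongo.uri` in the Hive table mapping should carry the resolvable hostname rather than `localhost`. A sketch of such a mapping (table name, columns, and the host `mongo-host` are placeholders; adjust to your schema and mongo-hadoop version):

```sql
CREATE EXTERNAL TABLE users (id INT, name STRING, age INT)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES (
  'mongo.columns.mapping' = '{"id":"_id","name":"name","age":"age"}'
)
TBLPROPERTIES (
  -- A hostname every task node can resolve, not localhost
  'mongo.uri' = 'mongodb://mongo-host:27017/test.users'
);
```

Every mapper reads `mongo.uri` from the table properties, so this single change fixes the connection for the whole cluster.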

Comment: Hi, I changed it to the hostname and re-mapped the Mongo collection to the Hive table; it works fine now. Thanks –