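Hive crashes on where clause
I'm trying to get a Hive/Hadoop/Mongo setup working. I've imported data into MongoDB from a JSON file, and then created both an internal and an external table in Hive that connect to Mongo: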
CREATE EXTERNAL TABLE reviews(
user_id STRING,
review_id STRING,
stars INT,
date1 STRING,
text STRING,
type STRING,
business_id STRING
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'='{"date1":"date"}')
TBLPROPERTIES('mongo.uri'='mongodb://localhost:27017/test.reviews');
This part works fine: a select-all query (select * from reviews) outputs everything as it should. But when I run a query with a where clause (select * from reviews where stars=4, for example), Hive crashes.
I add the following jars when I start up Hive:
add jar mongo-hadoop.jar;
add jar mongo-java-driver-3.3.0.jar;
add jar mongo-hadoop-hive-2.0.1.jar;
In case it's relevant, I'm running this on an Amazon EMR cluster, and I'm connected through ssh.
Thanks for any help.
Here's the error Hive throws:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.exec.Utilities.deserializeExpression(Ljava/lang/String;)Lorg/apache/hadoop/hive/ql/plan/ExprNodeGenericFuncDesc;
at com.mongodb.hadoop.hive.input.HiveMongoInputFormat.getFilter(HiveMongoInputFormat.java:134)
at com.mongodb.hadoop.hive.input.HiveMongoInputFormat.getRecordReader(HiveMongoInputFormat.java:103)
at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:691)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:329)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:455)
at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:424)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:144)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1885)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:252)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
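A NoSuchMethodError like the one above generally means the method signature the calling jar was compiled against is not present in the classes actually loaded at runtime, which often points to a version mismatch between jars. A minimal sketch for checking which overloads of a method exist on the current classpath, using only standard Java reflection; `MethodProbe` and `signatures` are hypothetical names for illustration and are not part of Hive or mongo-hadoop, and the default class and method names are simply taken from the stack trace:

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Hive or mongo-hadoop): lists the
// signatures of every method with a given name declared on a class.
class MethodProbe {

    static List<String> signatures(String className, String methodName)
            throws ClassNotFoundException {
        // Load without running static initializers; we only inspect metadata.
        Class<?> cls = Class.forName(
                className, false, MethodProbe.class.getClassLoader());
        List<String> found = new ArrayList<>();
        for (Method m : cls.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                found.add(m.toGenericString());
            }
        }
        return found;
    }

    public static void main(String[] args) throws Exception {
        // Defaults come from the stack trace; override via arguments.
        String className = args.length > 0
                ? args[0] : "org.apache.hadoop.hive.ql.exec.Utilities";
        String methodName = args.length > 1
                ? args[1] : "deserializeExpression";

        List<String> sigs = signatures(className, methodName);
        if (sigs.isEmpty()) {
            System.out.println(
                    "No method named " + methodName + " on " + className);
        }
        for (String s : sigs) {
            System.out.println(s);
        }
    }
}
```

This would need to be run with the same Hive and Hadoop jars on the classpath that the Hive CLI uses. If no overload takes a String and returns ExprNodeGenericFuncDesc, the connector jar was presumably built against a different Hive version than the one installed.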
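I'm now getting this error: 'FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: java.io.IOException No FileSystem for scheme: mongodb)' – Jonathan
On a related note, would this mean that Hive reads MongoDB dump files rather than querying Mongo directly? – Jonathan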