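Sqoop import --as-parquetfile with CDH5
I am trying to import data from MySQL directly into Parquet, but it doesn't seem to work correctly...
I'm using CDH 5.3, which includes Sqoop 1.4.5.
Here is my command line: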
sqoop import --connect jdbc:mysql://xx.xx.xx.xx/database --username username --password mypass --query 'SELECT page_id,user_id FROM pages_users WHERE $CONDITIONS' --split-by page_id --hive-import --hive-table default.pages_users3 --target-dir hive_pages_users --as-parquetfile
Then I get this error:
Warning: /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/01/09 14:31:49 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.0
15/01/09 14:31:49 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/01/09 14:31:49 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
15/01/09 14:31:49 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
15/01/09 14:31:49 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
15/01/09 14:31:49 INFO tool.CodeGenTool: Beginning code generation
15/01/09 14:31:50 INFO manager.SqlManager: Executing SQL statement: SELECT page_id,user_id FROM pages_users WHERE (1 = 0)
15/01/09 14:31:50 INFO manager.SqlManager: Executing SQL statement: SELECT page_id,user_id FROM pages_users WHERE (1 = 0)
15/01/09 14:31:50 INFO manager.SqlManager: Executing SQL statement: SELECT page_id,user_id FROM pages_users WHERE (1 = 0)
15/01/09 14:31:50 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/b90e7b492f5b66554f2cca3f88ef7a61/QueryResult.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/01/09 14:31:51 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/b90e7b492f5b66554f2cca3f88ef7a61/QueryResult.jar
15/01/09 14:31:51 INFO mapreduce.ImportJobBase: Beginning query import.
15/01/09 14:31:51 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/01/09 14:31:51 INFO manager.SqlManager: Executing SQL statement: SELECT page_id,user_id FROM pages_users WHERE (1 = 0)
15/01/09 14:31:51 INFO manager.SqlManager: Executing SQL statement: SELECT page_id,user_id FROM pages_users WHERE (1 = 0)
15/01/09 14:31:51 WARN spi.Registration: Not loading URI patterns in org.kitesdk.data.spi.hive.Loader
15/01/09 14:31:51 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI: hive?dataset=default.pages_users3
org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI: hive?dataset=default.pages_users3
at org.kitesdk.data.spi.Registration.lookupDatasetUri(Registration.java:109)
at org.kitesdk.data.Datasets.create(Datasets.java:189)
at org.kitesdk.data.Datasets.create(Datasets.java:240)
at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:81)
at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:70)
at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:112)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:262)
at org.apache.sqoop.manager.SqlManager.importQuery(SqlManager.java:721)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:499)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
I have no problem importing the data into the Hive file format, but Parquet is a problem... Do you have any idea why this happens?
Thanks :)
From the docs: when importing a free-form query, you must specify a destination directory with --target-dir. – timfeirg