
HDP 2.4.2.0-258 with Ambari 2.2.2.0: Sqoop import into HCatalog/Hive - tables not visible

I have to import several SQL Server schemas that should be accessible via Hive, Pig, MR, and (in the future) any third party. I decided to import into HCatalog.

Sqoop provides ways to import into either Hive or HCatalog. My assumption is that if I import into HCatalog, the same tables become accessible from the Hive CLI, MR, and Pig (please evaluate my assumption).
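A hedged sketch of that assumption (the table name here assumes the import below succeeds into the default database): since HCatalog is a layer over the Hive metastore, an HCatalog-registered table should be readable from Pig via HCatLoader and from the Hive CLI directly:

# Pig: -useHCatalog puts the HCatalog jars on the classpath; the table name is an assumption
pig -useHCatalog -e "A = LOAD 'default.settingattribute' USING org.apache.hive.hcatalog.pig.HCatLoader(); DUMP A;"

# Hive sees the same table, because HCatalog shares the Hive metastore
hive -e "SELECT * FROM default.settingattribute LIMIT 10;"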

Questions:

  • If I import directly into Hive, will the tables be available to Pig and MR?
  • If I import into HCatalog, what is needed to access the tables via Hive?
  • Do the tables need to be pre-created in Hive? If yes, what is the advantage of importing into HCatalog over importing directly into Hive, or over importing into HDFS and then creating external tables (as sketched below)?
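For context, the "import into HDFS, then create external tables" alternative from the last question would look roughly like this (a sketch; the target path and column list are illustrative, and this manual DDL is exactly what I want to avoid for ~100 tables):

# Plain HDFS import; no Hive/HCatalog metadata is created (path is illustrative)
sqoop import --connect 'jdbc:sqlserver://<IP>;database=FleetManagement' \
    --username --password \
    --table SettingAttribute \
    --target-dir /org/data/schema1/SettingAttribute \
    -- --schema Administration

# Each table must then be registered by hand (the columns here are assumptions)
hive -e "CREATE EXTERNAL TABLE SettingAttribute (SettingAttributeId INT, Name STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION '/org/data/schema1/SettingAttribute';"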

Problem: what I wish to achieve in one step is the following (a sketch follows the list):

  • Import the data (from the SQL Server tables)
  • Avoid pre-creating the tables or writing CREATE statements for them (there are around 100 of them)
  • Store the tables in ORC format
  • Store the data under custom HDFS paths, say /org/data/schema1, /org/data/schema2, and so on (Sqoop says this is not possible with --target-dir / --warehouse-dir)
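A minimal sketch of how such a one-step import could be scripted per table (the table list is a hypothetical placeholder; credentials are redacted as in the command below):

#!/bin/bash
# Hypothetical sketch: import each SQL Server table into HCatalog as ORC,
# letting Sqoop create the Hive tables itself via --create-hcatalog-table.
# Note: --target-dir/--warehouse-dir are not accepted together with the
# --hcatalog-* options, so a custom HDFS path cannot be set this way.
TABLES="SettingAttribute DatabaseLog"
for TABLE in $TABLES; do
    sqoop import \
        --hcatalog-home /usr/hdp/current/hive-webhcat \
        --hcatalog-database default \
        --hcatalog-table "$TABLE" \
        --create-hcatalog-table \
        --hcatalog-storage-stanza "stored as orcfile" \
        --connect 'jdbc:sqlserver://<IP>;database=FleetManagement' \
        --username --password \
        --table "$TABLE" \
        -- --schema Administration
done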

I executed the following command:

-bash-4.2$ sqoop import \
    --connect 'jdbc:sqlserver://<IP>;database=FleetManagement' \
    --username --password \
    --table SettingAttribute \
    -- --schema Administration \
    --hcatalog-home /usr/hdp/current/hive-webhcat \
    --hcatalog-database default \
    --hcatalog-table SettingAttribute \
    --create-hcatalog-table \
    --hcatalog-storage-stanza "stored as orcfile"

The source table contains 109 records, and they were fetched:

hdfs dfs -ls /user/ojoqcu/SettingAttribute 
Found 5 items 
-rw------- 3 ojoqcu hdfs   0 2016-08-10 15:02 /user/ojoqcu/SettingAttribute/_SUCCESS 
-rw------- 3 ojoqcu hdfs  8378 2016-08-10 15:02 /user/ojoqcu/SettingAttribute/part-m-00000 
-rw------- 3 ojoqcu hdfs  144 2016-08-10 15:02 /user/ojoqcu/SettingAttribute/part-m-00001 
-rw------- 3 ojoqcu hdfs  1123 2016-08-10 15:02 /user/ojoqcu/SettingAttribute/part-m-00002 
-rw------- 3 ojoqcu hdfs  434 2016-08-10 15:02 /user/ojoqcu/SettingAttribute/part-m-00003 

16/08/10 15:02:27 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.2.0-258 
16/08/10 15:02:27 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 
16/08/10 15:02:28 INFO manager.SqlManager: Using default fetchSize of 1000 
16/08/10 15:02:28 INFO manager.SQLServerManager: We will use schema Administration 
16/08/10 15:02:28 INFO tool.CodeGenTool: Beginning code generation 
16/08/10 15:02:28 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM [Administration].[SettingAttribute] AS t WHERE 1=0 
16/08/10 15:02:28 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.4.2.0-258/hadoop-mapreduce 
Note: /tmp/sqoop-ojoqcu/compile/dfab14748c41a566ec286b7e4b11004d/SettingAttribute.java uses or overrides a deprecated API. 
Note: Recompile with -Xlint:deprecation for details. 
16/08/10 15:02:30 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-ojoqcu/compile/dfab14748c41a566ec286b7e4b11004d/SettingAttribute.jar 
16/08/10 15:02:30 INFO mapreduce.ImportJobBase: Beginning import of SettingAttribute 
SLF4J: Class path contains multiple SLF4J bindings. 
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/accumulo/lib/slf4j-log4j12.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 
16/08/10 15:02:31 INFO impl.TimelineClientImpl: Timeline service address: http://l4373t.sss.com:8188/ws/v1/timeline/ 
16/08/10 15:02:31 INFO client.RMProxy: Connecting to ResourceManager at l4283t.sss.com/138.106.9.80:8050 
16/08/10 15:02:33 INFO db.DBInputFormat: Using read commited transaction isolation 
16/08/10 15:02:33 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN([SettingAttributeId]), MAX([SettingAttributeId]) FROM [Administration].[SettingAttribute] 
16/08/10 15:02:33 INFO mapreduce.JobSubmitter: number of splits:4 
16/08/10 15:02:33 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1467787344827_0013 
16/08/10 15:02:34 INFO impl.YarnClientImpl: Submitted application application_1467787344827_0013 
16/08/10 15:02:34 INFO mapreduce.Job: The url to track the job: http://l4283t.sss.com:8088/proxy/application_1467787344827_0013/ 
16/08/10 15:02:34 INFO mapreduce.Job: Running job: job_1467787344827_0013 
16/08/10 15:02:41 INFO mapreduce.Job: Job job_1467787344827_0013 running in uber mode : false 
16/08/10 15:02:41 INFO mapreduce.Job: map 0% reduce 0% 
16/08/10 15:02:47 INFO mapreduce.Job: map 100% reduce 0% 
16/08/10 15:02:48 INFO mapreduce.Job: Job job_1467787344827_0013 completed successfully 
16/08/10 15:02:48 INFO mapreduce.Job: Counters: 30 
     File System Counters 
       FILE: Number of bytes read=0 
       FILE: Number of bytes written=616636 
       FILE: Number of read operations=0 
       FILE: Number of large read operations=0 
       FILE: Number of write operations=0 
       HDFS: Number of bytes read=540 
       HDFS: Number of bytes written=10079 
       HDFS: Number of read operations=16 
       HDFS: Number of large read operations=0 
       HDFS: Number of write operations=8 
     Job Counters 
       Launched map tasks=4 
       Other local map tasks=4 
       Total time spent by all maps in occupied slots (ms)=16132 
       Total time spent by all reduces in occupied slots (ms)=0 
       Total time spent by all map tasks (ms)=16132 
       Total vcore-seconds taken by all map tasks=16132 
       Total megabyte-seconds taken by all map tasks=66076672 
     Map-Reduce Framework 
       Map input records=109 
       Map output records=109 
       Input split bytes=540 
       Spilled Records=0 
       Failed Shuffles=0 
       Merged Map outputs=0 
       GC time elapsed (ms)=320 
       CPU time spent (ms)=6340 
       Physical memory (bytes) snapshot=999870464 
       Virtual memory (bytes) snapshot=21872697344 
       Total committed heap usage (bytes)=943194112 
     File Input Format Counters 
       Bytes Read=0 
     File Output Format Counters 
       Bytes Written=10079 
16/08/10 15:02:48 INFO mapreduce.ImportJobBase: Transferred 9.8428 KB in 17.2115 seconds (585.597 bytes/sec) 
16/08/10 15:02:48 INFO mapreduce.ImportJobBase: Retrieved 109 records. 

The files were created under my user.

However, I cannot see anything in HCatalog (nor in Hive):

-bash-4.2$ /usr/hdp/2.4.2.0-258/hive-hcatalog/bin/hcat -e "show tables in default;" 
WARNING: Use "yarn jar" to launch YARN applications. 
16/08/10 15:07:12 WARN conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist 
OK 
Time taken: 2.007 seconds 

Is there some authorization issue?

I checked /var/log, but no logs exist there for Sqoop, Hive-HCatalog, or Hive. How can I check for an authorization issue and fix it?
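A few places one could check (a sketch; these are typical HDP default locations and may differ per installation):

# Hive and WebHCat/HCatalog server logs usually live here on HDP
ls /var/log/hive /var/log/webhcat

# The metastore log is the most likely place for table-creation problems
tail -100 /var/log/hive/hivemetastore.log

# Sqoop runs client-side, so its output is the console log shown above.
# Also check the ownership of the Hive warehouse directory:
hdfs dfs -ls /apps/hive/warehouse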

Answer


Well, I am not sure whether it was an authorization issue or merely a parsing issue, or both. I did the following and it worked:

  1. Did a su hive
  2. Executed the following command (apparently, -- --schema should be the last argument; Sqoop simply ignores everything that comes after it!)

    sqoop import \
        --hcatalog-home /usr/hdp/current/hive-webhcat \
        --hcatalog-database FleetManagement_Ape \
        --hcatalog-table DatabaseLog \
        --create-hcatalog-table \
        --hcatalog-storage-stanza "stored as orcfile" \
        --connect 'jdbc:sqlserver://<IP>;database=FleetManagement' \
        --username --password \
        --table DatabaseLog \
        -- --schema ape
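If this works, the created table should show up via the same hcat check as above (a usage sketch):

    /usr/hdp/2.4.2.0-258/hive-hcatalog/bin/hcat -e "show tables in FleetManagement_Ape;"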
