2014-10-30

I'm trying to use Sqoop to import a table from Vertica into DataStax Enterprise 4.5. No errors or exceptions are reported, but no data ends up in the target table — I can't import data from Vertica into Cassandra using Sqoop.

Here is what I did:

Created the keyspace and table in cqlsh:

CREATE KEYSPACE IF NOT EXISTS npa_nxx WITH replication = { 
    'class': 'SimpleStrategy', 'replication_factor': '1' }; 

CREATE TABLE npa_nxx.npa_nxx_data (
    region varchar,
    market varchar,
    PRIMARY KEY (market));
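After the import job finishes, a quick way to confirm whether any rows actually arrived is a count from cqlsh. A sketch (assumes cqlsh is on the path and can reach the local node; the `count.cql` file name is just an example):

```shell
# count.cql contains a single statement:
#   SELECT COUNT(*) FROM npa_nxx.npa_nxx_data;
# Run it non-interactively and print the result
cqlsh -f count.cql
```

If the count is 0 while Sqoop reports records retrieved, the rows went somewhere other than the Cassandra table.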

Created an options file:

cql-import 
--table 
dim_location 
--cassandra-keyspace 
npa_nxx 
--cassandra-table 
npa_nxx_data 
--cassandra-column-mapping 
region:region,market:market 
--connect 
jdbc:vertica://xx.xxx.xx.xxx:5433/schema 
--driver 
com.vertica.jdbc.Driver 
--username 
xxxxx 
--password 
xxx 
--cassandra-host 
xx.xxx.xx.xxx 
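Incidentally, the log below warns that putting the password in plain text is insecure. Sqoop's `-P` flag (prompt for the password at run time) should also work from an options file — a hedged sketch of the tail of the file with the `--password` lines replaced:

```
--username
xxxxx
-P
--cassandra-host
xx.xxx.xx.xxx
```

This doesn't affect the missing-data problem, but it silences the warning and keeps the password out of the file.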

Then ran the sqoop command:

dse sqoop --options-file /usr/share/dse/demos/sqoop/import.options 

And here is the full output:

14/10/30 09:28:53 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 
14/10/30 09:28:53 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time. 
14/10/30 09:28:53 INFO manager.SqlManager: Using default fetchSize of 1000 
14/10/30 09:28:53 INFO tool.CodeGenTool: Beginning code generation 
14/10/30 09:28:54 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM dim_location AS t WHERE 1=0 
14/10/30 09:28:54 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM dim_location AS t WHERE 1=0 
14/10/30 09:28:54 INFO orm.CompilationManager: $HADOOP_MAPRED_HOME is not set 
Note: /tmp/sqoop-root/compile/159b8e57e91397f8c48f4455f6da0e5a/dim_location.java uses or overrides a deprecated API. 
Note: Recompile with -Xlint:deprecation for details. 
14/10/30 09:28:55 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/159b8e57e91397f8c48f4455f6da0e5a/dim_location.jar 
14/10/30 09:28:55 INFO mapreduce.ImportJobBase: Beginning import of dim_location 
14/10/30 09:28:56 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM dim_location AS t WHERE 1=0 
14/10/30 09:28:56 INFO snitch.Workload: Setting my workload to Cassandra 
14/10/30 09:28:58 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
14/10/30 09:28:59 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(MARKET), MAX(MARKET) FROM dim_location 
14/10/30 09:28:59 WARN db.TextSplitter: Generating splits for a textual index column. 
14/10/30 09:28:59 WARN db.TextSplitter: If your database sorts in a case-insensitive order, this may result in a partial import or duplicate records. 
14/10/30 09:28:59 WARN db.TextSplitter: You are strongly encouraged to choose an integral split column. 
14/10/30 09:29:00 INFO mapred.JobClient: Running job: job_201410291321_0012 
14/10/30 09:29:01 INFO mapred.JobClient: map 0% reduce 0% 
14/10/30 09:29:18 INFO mapred.JobClient: map 20% reduce 0% 
14/10/30 09:29:22 INFO mapred.JobClient: map 40% reduce 0% 
14/10/30 09:29:25 INFO mapred.JobClient: map 60% reduce 0% 
14/10/30 09:29:28 INFO mapred.JobClient: map 80% reduce 0% 
14/10/30 09:29:31 INFO mapred.JobClient: map 100% reduce 0% 
14/10/30 09:29:34 INFO mapred.JobClient: Job complete: job_201410291321_0012 
14/10/30 09:29:34 INFO mapred.JobClient: Counters: 18 
14/10/30 09:29:34 INFO mapred.JobClient: Job Counters 
14/10/30 09:29:34 INFO mapred.JobClient:  SLOTS_MILLIS_MAPS=29652 
14/10/30 09:29:34 INFO mapred.JobClient:  Total time spent by all reduces waiting after reserving slots (ms)=0 
14/10/30 09:29:34 INFO mapred.JobClient:  Total time spent by all maps waiting after reserving slots (ms)=0 
14/10/30 09:29:34 INFO mapred.JobClient:  Launched map tasks=5 
14/10/30 09:29:34 INFO mapred.JobClient:  SLOTS_MILLIS_REDUCES=0 
14/10/30 09:29:34 INFO mapred.JobClient: File Output Format Counters 
14/10/30 09:29:34 INFO mapred.JobClient:  Bytes Written=2003 
14/10/30 09:29:34 INFO mapred.JobClient: FileSystemCounters 
14/10/30 09:29:34 INFO mapred.JobClient:  FILE_BYTES_WRITTEN=130485 
14/10/30 09:29:34 INFO mapred.JobClient:  CFS_BYTES_WRITTEN=2003 
14/10/30 09:29:34 INFO mapred.JobClient:  CFS_BYTES_READ=664 
14/10/30 09:29:34 INFO mapred.JobClient: File Input Format Counters 
14/10/30 09:29:34 INFO mapred.JobClient:  Bytes Read=0 
14/10/30 09:29:34 INFO mapred.JobClient: Map-Reduce Framework 
14/10/30 09:29:34 INFO mapred.JobClient:  Map input records=98 
14/10/30 09:29:34 INFO mapred.JobClient:  Physical memory (bytes) snapshot=985702400 
14/10/30 09:29:34 INFO mapred.JobClient:  Spilled Records=0 
14/10/30 09:29:34 INFO mapred.JobClient:  CPU time spent (ms)=1260 
14/10/30 09:29:34 INFO mapred.JobClient:  Total committed heap usage (bytes)=1249378304 
14/10/30 09:29:34 INFO mapred.JobClient:  Virtual memory (bytes) snapshot=8317739008 
14/10/30 09:29:34 INFO mapred.JobClient:  Map output records=98 
14/10/30 09:29:34 INFO mapred.JobClient:  SPLIT_RAW_BYTES=664 
14/10/30 09:29:34 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 38.8727 seconds (0 bytes/sec) 
14/10/30 09:29:34 INFO mapreduce.ImportJobBase: Retrieved 98 records. 

Anyone have any idea what is going on here? Thanks!


- Would love to know what causes this, and which directory it is that I should be deleting.. – 2014-11-08 19:09:44


What is the structure of the dim_location table? – mikea 2014-11-09 17:44:16

Answer


I had the same problem using DSE going from SQL Server to Cassandra. Run this to find out where your files are in CFS:

dse hadoop fs -ls <location given in target directory> 
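If the import wrote to a CFS directory instead of the Cassandra table, the 98 retrieved rows will show up there as text part files from the map tasks. A sketch of how to inspect them — the `/sqoop/dim_location` path is hypothetical, substitute whatever target directory the job actually used:

```shell
# List the Sqoop output directory in CFS (path is hypothetical)
dse hadoop fs -ls /sqoop/dim_location
# Dump one map-task output file to see the exported rows
dse hadoop fs -cat /sqoop/dim_location/part-m-00000
```

Deleting that directory (`dse hadoop fs -rmr <dir>`) also lets you re-run the job without an "output directory already exists" failure.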