2017-03-20 251 views

I need to do a Sqoop export with --update-key to update a table in Microsoft SQL Server (Sqoop export --update-key from HDFS to MSSQL).

I have a Hive table that I can successfully export to the MSSQL table with:

sqoop export --connect jdbc:sqlserver://{some.ip.address};database={somedatabase} \
    --username 'someuser' \
    --password-file '/some/password/file' \
    --table 'Sometable' \
    --columns ID,value1,value2 \
    --export-dir /apps/hive/warehouse/some.db/Sometable \
    --input-fields-terminated-by "||" \
    -m 2 \
    /user/oozie/share/lib/sqoop/sqljdbc4.jar

However, I want to update on a key, so I then run:

sqoop export --connect jdbc:sqlserver://{some.ip.address};database={somedatabase} \
    --username 'someuser' \
    --password-file '/some/password/file' \
    --table 'Sometable' \
    --columns ID,value1,value2 \
    --export-dir /apps/hive/warehouse/some.db/Sometable \
    --input-fields-terminated-by "||" \
    --update-key ID \
    --update-mode allowinsert \
    -m 2 \
    /user/oozie/share/lib/sqoop/sqljdbc4.jar

The logs are not very helpful (note: Sqoop runs inside an Oozie job):

... 
5972 [main] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1485423751090_3566 
6016 [main] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://had003.headquarters.7layer.net:8088/proxy/application_1485423751090_3566/ 
6017 [main] INFO org.apache.hadoop.mapreduce.Job - Running job: job_1485423751090_3566 
20284 [main] INFO org.apache.hadoop.mapreduce.Job - Job job_1485423751090_3566 running in uber mode : false 
20287 [main] INFO org.apache.hadoop.mapreduce.Job - map 0% reduce 0% 
27001 [main] INFO org.apache.hadoop.mapreduce.Job - map 50% reduce 0% 
Heart beat 
37117 [main] INFO org.apache.hadoop.mapreduce.Job - map 100% reduce 0% 
38139 [main] INFO org.apache.hadoop.mapreduce.Job - Job job_1485423751090_3566 failed with state FAILED due to: Task failed task_1485423751090_3566_m_000000 
Job failed as tasks failed. failedMaps:1 failedReduces:0 


38292 [main] INFO org.apache.hadoop.mapreduce.Job - Counters: 32 
    File System Counters 
     FILE: Number of bytes read=0 
     FILE: Number of bytes written=338177 
     FILE: Number of read operations=0 
     FILE: Number of large read operations=0 
     FILE: Number of write operations=0 
     HDFS: Number of bytes read=166 
     HDFS: Number of bytes written=0 
     HDFS: Number of read operations=4 
     HDFS: Number of large read operations=0 
     HDFS: Number of write operations=0 
    Job Counters 
     Failed map tasks=1 
     Launched map tasks=2 
     Other local map tasks=1 
     Rack-local map tasks=1 
     Total time spent by all maps in occupied slots (ms)=16369 
     Total time spent by all reduces in occupied slots (ms)=0 
     Total time spent by all map tasks (ms)=16369 
     Total vcore-milliseconds taken by all map tasks=16369 
     Total megabyte-milliseconds taken by all map tasks=25142784 
    Map-Reduce Framework 
     Map input records=0 
     Map output records=0 
     Input split bytes=156 
     Spilled Records=0 
     Failed Shuffles=0 
     Merged Map outputs=0 
     GC time elapsed (ms)=79 
     CPU time spent (ms)=960 
     Physical memory (bytes) snapshot=230920192 
     Virtual memory (bytes) snapshot=3235606528 
     Total committed heap usage (bytes)=162529280 
    File Input Format Counters 
     Bytes Read=0 
    File Output Format Counters 
     Bytes Written=0 
38319 [main] INFO org.apache.sqoop.mapreduce.ExportJobBase - Transferred 166 bytes in 34.0574 seconds (4.8741 bytes/sec) 
38332 [main] INFO org.apache.sqoop.mapreduce.ExportJobBase - Exported 0 records. 
38332 [main] ERROR org.apache.sqoop.mapreduce.ExportJobBase - Export job failed! 
38333 [main] ERROR org.apache.sqoop.tool.ExportTool - Error during export: Export job failed! 


<<< Invocation of Sqoop command completed <<< 


Hadoop Job IDs executed by Sqoop: job_1485423751090_3566 


Intercepting System.exit(1) 


<<< Invocation of Main class completed <<< 


Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1] 


Oozie Launcher failed, finishing Hadoop job gracefully 


Oozie Launcher, uploading action data to HDFS sequence file: hdfs://{Something} 
38406 [main] INFO org.apache.hadoop.io.compress.zlib.ZlibFactory - Successfully loaded & initialized native-zlib library 
38407 [main] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new compressor [.deflate] 


Oozie Launcher ends 


38538 [main] INFO org.apache.hadoop.mapred.Task - Task:attempt_1485423751090_3565_m_000000_0 is done. And is in the process of committing 
38601 [main] INFO org.apache.hadoop.mapred.Task - Task attempt_1485423751090_3565_m_000000_0 is allowed to commit now 
38641 [main] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt_1485423751090_3565_m_000000_0' to hdfs://{Something} 
38692 [main] INFO org.apache.hadoop.mapred.Task - Task 'attempt_1485423751090_3565_m_000000_0' done. 

Does anyone have an idea why I can't upsert into MSSQL? Is it not supported? Am I doing something wrong?

Answer


--update-mode allowinsert may be the culprit here.

According to the documentation:

Depending on the target database, you may also specify the --update-mode argument with allowinsert mode if you want to update rows if they exist in the database already, or insert rows if they do not yet exist.

To achieve this, Sqoop uses:

INSERT INTO … ON DUPLICATE KEY UPDATE 

This UPSERT functionality (this type of query) is not supported by all RDBMSes.

Read more about upsert in Sqoop.
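For context, the statement quoted above is MySQL syntax; SQL Server expresses the same upsert idea with MERGE, which is what a SQL Server-aware connector would have to generate instead. A minimal sketch for comparison (the column values are illustrative, not taken from the question):

```sql
-- MySQL-style upsert (what the generic connector's query shape looks like):
INSERT INTO Sometable (ID, value1, value2)
VALUES (1, 'a', 'b')
ON DUPLICATE KEY UPDATE value1 = VALUES(value1), value2 = VALUES(value2);

-- SQL Server equivalent of the same update-or-insert logic:
MERGE Sometable AS target
USING (SELECT 1 AS ID, 'a' AS value1, 'b' AS value2) AS source
ON target.ID = source.ID
WHEN MATCHED THEN
    UPDATE SET value1 = source.value1, value2 = source.value2
WHEN NOT MATCHED THEN
    INSERT (ID, value1, value2) VALUES (source.ID, source.value1, source.value2);
```

If the connector in use only knows the MySQL form, the generated statement will fail against SQL Server even though the upsert concept itself is supported there.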


Hi Dev, thanks for the answer. I'm using Sqoop 1.4.6, and the release notes [link](https://sqoop.apache.org/docs/1.4.6/sqoop-1.4.6.releasenotes.html) say it is supported: New features: SQOOP-1403 Upsert export for SQL Server – antipko


@antipko Thanks for the reference. Can you add '--verbose' to your Sqoop command to check the extended logs? Nothing can be concluded from the current logs. –
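Following that comment, this is what the failing command looks like with --verbose added; a sketch reusing the placeholders from the question, not runnable outside the cluster:

```shell
sqoop export --verbose \
    --connect jdbc:sqlserver://{some.ip.address};database={somedatabase} \
    --username 'someuser' \
    --password-file '/some/password/file' \
    --table 'Sometable' \
    --columns ID,value1,value2 \
    --export-dir /apps/hive/warehouse/some.db/Sometable \
    --input-fields-terminated-by "||" \
    --update-key ID \
    --update-mode allowinsert \
    -m 2
```

Independently of --verbose, the root cause of a failed export map task usually appears only in the task attempt's own log, which can be pulled with `yarn logs -applicationId application_1485423751090_3566` (the application id from the launcher output above).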