2017-02-08 81 views
2

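Azure DataFactory chaining activities

I am very new to DataFactory and am having trouble understanding how to properly create a pipeline that executes a stored procedure before it runs a copy activity.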

The stored proc is just a TRUNCATE of the target table, which is used as the output dataset of the second activity.

The DataFactory documentation says that, to make the stored proc run first, I should specify the "output" of the proc as the "input" of the second activity.

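However, the stored proc has no real "output". To get it "working", I cloned the output dataset of the second activity, changed its name, and set external=false to get past the provisioning errors, but this is clearly a total kludge.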

It makes no sense to me why an output needs to be defined at all, at least in this case where the stored proc only performs a TRUNCATE.

However, when I try to use the output of the stored proc as an additional input, I receive an error about a duplicate table name.

How can I get the TRUNCATE stored proc activity to execute successfully (and complete) before the copy activity runs?

Here is the pipeline code:

{ 
    "name": "Traffic CRM - System User Stage", 
    "properties": { 
     "description": "Move System User to Stage", 
     "activities": [ 
      { 
       "type": "SqlServerStoredProcedure", 
       "typeProperties": { 
        "storedProcedureName": "dbo.usp_Truncate_Traffic_Crm_SystemUser", 
        "storedProcedureParameters": {} 
       }, 
       "outputs": [ 
        { 
         "name": "Smart App - usp Truncate System User" 
        } 
       ], 
       "policy": { 
        "timeout": "01:00:00", 
        "concurrency": 1, 
        "retry": 3 
       }, 
       "scheduler": { 
        "frequency": "Day", 
        "interval": 1 
       }, 
       "name": "Smart App - SystemUser Truncate" 
      }, 
      { 
       "type": "Copy", 
       "typeProperties": { 
        "source": { 
         "type": "SqlSource", 
         "sqlReaderQuery": "select * from [dbo].[Traffic_Crm_SystemUser]" 
        }, 
        "sink": { 
         "type": "SqlSink", 
         "writeBatchSize": 0, 
         "writeBatchTimeout": "00:00:00" 
        }, 
        "translator": { 
         "type": "TabularTranslator", 
         "columnMappings": "All columns mapped here" 
        } 
       }, 
       "inputs": [ 
        { 
         "name": "Traffic CRM - SytemUser Stage" 
        } 
       ], 
       "outputs": [ 
        { 
         "name": "Smart App - System User Stage Production" 
        } 
       ], 
       "policy": { 
        "timeout": "1.00:00:00", 
        "concurrency": 1, 
        "executionPriorityOrder": "NewestFirst", 
        "style": "StartOfInterval", 
        "retry": 3, 
        "longRetry": 0, 
        "longRetryInterval": "00:00:00" 
       }, 
       "scheduler": { 
        "frequency": "Day", 
        "interval": 1 
       }, 
       "name": "Activity-0-[dbo]_[Traffic_Crm_SystemUser]->[dbo]_[Traffic_Crm_SystemUser]" 
      } 
     ], 
     "start": "2017-01-19T14:30:57.309Z", 
     "end": "2099-12-31T05:00:00Z", 
     "isPaused": false, 
     "hubName": "stagingdatafactory1_hub", 
     "pipelineMode": "Scheduled" 
    } 
} 

Answer

2

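The output dataset of your SP activity, i.e. "name": "Smart App - usp Truncate System User", should be the input of the next activity. If you are unsure what to put in the dataset, just create a dummy dataset like the one below: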

{ 
    "name": "DummySPDS", 
    "properties": { 
     "published": false, 
     "type": "SqlServerTable", 
     "linkedServiceName": "SQLServerLS", 
     "typeProperties": { 
      "tableName": "dummyTable" 
     }, 
     "availability": { 
      "frequency": "Hour", 
      "interval": 1 
     }, 
     "external": true 
    } 
} 
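
The definition of the stored procedure activity's output dataset is not shown in the thread. A minimal sketch of what it could look like follows, assuming the ADF v1 dataset JSON format. The linked service name is reused from the dummy dataset above and the `tableName` value is a made-up placeholder; neither is confirmed by the original poster. The availability is set to daily so that it lines up with the `scheduler` section of the activity that produces it.

```json
{
    "name": "Smart App - usp Truncate System User",
    "properties": {
        "type": "SqlServerTable",
        "linkedServiceName": "SQLServerLS",
        "typeProperties": {
            "tableName": "placeholderTruncateMarker"
        },
        "availability": {
            "frequency": "Day",
            "interval": 1
        },
        "external": false
    }
}
```

Because this dataset is produced by an activity inside the pipeline, it is marked as not external. As far as I understand the ADF v1 model, such a dataset mainly serves as a scheduling dependency for the stored procedure activity, which is why a placeholder table name may be enough here.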

Here is the complete pipeline code:

{ 
    "name": "Traffic CRM - System User Stage", 
    "properties": { 
     "description": "Move System User to Stage", 
     "activities": [ 
      { 
       "type": "SqlServerStoredProcedure", 
       "typeProperties": { 
        "storedProcedureName": "dbo.usp_Truncate_Traffic_Crm_SystemUser", 
        "storedProcedureParameters": {} 
       }, 
       "inputs": [ 
        { 
         "name": "DummySPDS" 
        } 
       ], 
       "outputs": [ 
        { 
         "name": "Smart App - usp Truncate System User" 
        } 
       ], 
       "policy": { 
        "timeout": "01:00:00", 
        "concurrency": 1, 
        "retry": 3 
       }, 
       "scheduler": { 
        "frequency": "Day", 
        "interval": 1 
       }, 
       "name": "Smart App - SystemUser Truncate" 
      }, 
      { 
       "type": "Copy", 
       "typeProperties": { 
        "source": { 
         "type": "SqlSource", 
         "sqlReaderQuery": "select * from [dbo].[Traffic_Crm_SystemUser]" 
        }, 
        "sink": { 
         "type": "SqlSink", 
         "writeBatchSize": 0, 
         "writeBatchTimeout": "00:00:00" 
        }, 
        "translator": { 
         "type": "TabularTranslator", 
         "columnMappings": "All columns mapped here" 
        } 
       }, 
       "inputs": [ 
        { 
         "name": "Smart App - usp Truncate System User" 
        } 
       ], 
       "outputs": [ 
        { 
         "name": "Smart App - System User Stage Production" 
        } 
       ], 
       "policy": { 
        "timeout": "1.00:00:00", 
        "concurrency": 1, 
        "executionPriorityOrder": "NewestFirst", 
        "style": "StartOfInterval", 
        "retry": 3, 
        "longRetry": 0, 
        "longRetryInterval": "00:00:00" 
       }, 
       "scheduler": { 
        "frequency": "Day", 
        "interval": 1 
       }, 
       "name": "Activity-0-[dbo]_[Traffic_Crm_SystemUser]->[dbo]_[Traffic_Crm_SystemUser]" 
      } 
     ], 
     "start": "2017-01-19T14:30:57.309Z", 
     "end": "2099-12-31T05:00:00Z", 
     "isPaused": false, 
     "hubName": "stagingdatafactory1_hub", 
     "pipelineMode": "Scheduled" 
    } 
} 

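I added the dummy dataset as described, but then the second activity lost the mapping that the copy activity requires. I then tried adding a second item to 'inputs', but received a 'duplicate object key referenced table name' error, even though my dummy dataset does not contain the same table name. This is the post that suggested adding a second 'name' object to the inputs: http://stackoverflow.com/questions/35970079/azure-data-factory-multiple-activities-in-pipeline-execution-order – rcastagna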
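
For reference, the two-input variant described in the comment above would look roughly like the sketch below. It keeps the original source dataset as the first input and adds the stored procedure's output dataset as the second. This relies on the ADF v1 behaviour, as I understand it from the scheduling documentation, that a copy activity reads only from its first input and treats any further inputs as dependencies. It is untested, and the commenter reports a 'duplicate object key' error with a configuration of this shape that the thread does not resolve.

```json
{
    "type": "Copy",
    "typeProperties": {
        "source": {
            "type": "SqlSource",
            "sqlReaderQuery": "select * from [dbo].[Traffic_Crm_SystemUser]"
        },
        "sink": {
            "type": "SqlSink",
            "writeBatchSize": 0,
            "writeBatchTimeout": "00:00:00"
        },
        "translator": {
            "type": "TabularTranslator",
            "columnMappings": "All columns mapped here"
        }
    },
    "inputs": [
        {
            "name": "Traffic CRM - SytemUser Stage"
        },
        {
            "name": "Smart App - usp Truncate System User"
        }
    ],
    "outputs": [
        {
            "name": "Smart App - System User Stage Production"
        }
    ],
    "scheduler": {
        "frequency": "Day",
        "interval": 1
    },
    "name": "Activity-0-[dbo]_[Traffic_Crm_SystemUser]->[dbo]_[Traffic_Crm_SystemUser]"
}
```

Keeping the source dataset first is what would preserve the column mapping, while the second entry only makes the copy wait for the truncate slice to complete.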


I have provided the complete pipeline code. I have not tested it on Azure, but it should work. – Manish