Hadoop on Azure: can I use different Blob storage containers for I/O?

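I am currently working on a project to create a big data architecture in Azure. To understand how Azure works, I created a Data Factory and a Blob storage account, and set up a pipeline that runs a Hadoop word count job on an on-demand HDInsight cluster.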

This is the pipeline JSON file:

{
  "name": "MRSamplePipeline5",
  "properties": {
    "description": "Sample Pipeline to Run the Word Count Program",
    "activities": [
      {
        "type": "HDInsightMapReduce",
        "typeProperties": {
          "className": "wordcount",
          "jarFilePath": "executables/hadoop-example.jar",
          "jarLinkedService": "AzureStorageLinkedService",
          "arguments": [
            "/davinci.txt",
            "/WordCountOutput1"
          ]
        },
        "outputs": [
          {
            "name": "MROutput4"
          }
        ],
        "policy": {
          "timeout": "01:00:00",
          "concurrency": 1,
          "retry": 3
        },
        "scheduler": {
          "frequency": "Minute",
          "interval": 15
        },
        "name": "MRActivity",
        "linkedServiceName": "HDInsightOnDemandLinkedService"
      }
    ],
    "start": "2017-07-24T00:00:00Z",
    "end": "2017-07-24T00:00:00Z",
    "isPaused": false,
    "hubName": "testazuredatafact_hub",
    "pipelineMode": "OneTime",
    "expirationTime": "3.00:00:00"
  }
}

It works: the output is a file named "WordCountOutput1/part-r-00000".

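My question is: how can I define the input file (davinci.txt) and the output (WordCountOutput1) so that they live in a different container of my Blob storage (for example "exampledata")?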

Source

2017-07-24 Markus Appel

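Hadoop file paths can be specified in full URI syntax, including a scheme and an authority, so that they point at different kinds of file systems (for example HDFS vs. Azure vs. S3) and, in this particular case, at different Azure Storage containers. The relevant scheme for Azure Storage access is "wasb". The authority contains the container and the account. For example, consider the following hadoop fs -ls commands.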

# WASB backed by container "test" in Azure Storage account "cnauroth"
hadoop fs -ls wasb://test@cnauroth.blob.core.windows.net/users/cnauroth

# WASB backed by container "qa" in Azure Storage account "cnauroth"
hadoop fs -ls wasb://qa@cnauroth.blob.core.windows.net/users/cnauroth

# WASB backed by container "production" in Azure Storage account "cnauroth-live"
hadoop fs -ls wasb://production@cnauroth-live.blob.core.windows.net/users/cnauroth

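Each of these commands, executed from the same client host, lists a different Azure Storage account/container.

You can use the same URI syntax in the arguments that you pass when you submit your job.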
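
Applied to the pipeline from the question, this could look like the following sketch of the activity's typeProperties, in which only the two arguments change. It assumes that the "exampledata" container lives in a storage account the HDInsight cluster can already access; <storageaccount> is a placeholder for that account's name, not a value taken from the question.

```json
{
  "typeProperties": {
    "className": "wordcount",
    "jarFilePath": "executables/hadoop-example.jar",
    "jarLinkedService": "AzureStorageLinkedService",
    "arguments": [
      "wasb://exampledata@<storageaccount>.blob.core.windows.net/davinci.txt",
      "wasb://exampledata@<storageaccount>.blob.core.windows.net/WordCountOutput1"
    ]
  }
}
```

A path without a scheme, such as /davinci.txt, is resolved against the cluster's default file system, which is why the original pipeline reads from and writes to the default container.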

Source

2017-07-25 18:47:54
