2016-03-10 93 views
6

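U-SQL unable to extract data from a JSON file

I am trying to extract data from a JSON file using U-SQL. The query either runs successfully without producing any output data, or fails with a "vertex failed fast" error.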

The JSON file looks like this:

{ 
    "results": [ 
    { 
     "name": "Sales/Account", 
     "id": "7367e3f2-e1a5-11e5-80e8-0933ecd4cd8c", 
     "deviceName": "HP", 
     "deviceModel": "g6-pavilion", 
     "clientip": "0.41.4.1" 
    }, 
    { 
     "name": "Sales/Account", 
     "id": "c01efba0-e0d5-11e5-ae20-af6dc1f2c036", 
     "deviceName": "acer", 
     "deviceModel": "veriton", 
     "clientip": "10.10.14.36" 
    } 
    ] 
} 

My U-SQL script:

REFERENCE ASSEMBLY [Newtonsoft.Json]; 
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats]; 

DECLARE @in string="adl://xyz.azuredatalakestore.net/todelete.json"; 

DECLARE @out string="adl://xyz.azuredatalakestore.net/todelete.tsv"; 

@trail2=EXTRACT results string FROM @in USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor(); 

@jsonify=SELECT Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(results,"name","id","deviceName","deviceModel","clientip") AS rec FROM @trail2; 

@logSchema=SELECT rec["name"] AS sysName, 
       rec["id"] AS sysId, 
       rec["deviceName"] AS domainDeviceName, 
       rec["deviceModel"] AS domainDeviceModel, 
       rec["clientip"] AS domainClientIp 
     FROM @jsonify; 

OUTPUT @logSchema TO @out USING Outputters.Tsv(); 

Answers

8

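Actually, the JsonExtractor supports a rowpath parameter, expressed in JSONPath, that lets you identify the JSON objects or JSON array items you want to map to rows. So you can extract the data from your JSON document with a single statement: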

@logSchema = 
    EXTRACT name string, id string, deviceName string, deviceModel string, clientip string 
    FROM @input 
    USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor("results[*]"); 
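
For context, here is a sketch of how that statement might fit into a complete script. It reuses the assembly references, the @in and @out paths, and the output column names from the question; the @records rowset name is introduced here only for illustration, and @in takes the place of the @input placeholder above. It assumes both assemblies are already registered in the database the job runs against.

```
// Assemblies referenced in the question; they must already be registered.
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];

// Input and output paths reused from the question.
DECLARE @in string = "adl://xyz.azuredatalakestore.net/todelete.json";
DECLARE @out string = "adl://xyz.azuredatalakestore.net/todelete.tsv";

// The rowpath "results[*]" maps each element of the "results" array to one row.
// @records is a hypothetical rowset name, not from the original thread.
@records =
    EXTRACT name string,
            id string,
            deviceName string,
            deviceModel string,
            clientip string
    FROM @in
    USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor("results[*]");

// Rename the columns the same way the question's final SELECT does.
@logSchema =
    SELECT name AS sysName,
           id AS sysId,
           deviceName AS domainDeviceName,
           deviceModel AS domainDeviceModel,
           clientip AS domainClientIp
    FROM @records;

OUTPUT @logSchema TO @out USING Outputters.Tsv();
```

With the sample file from the question, this should write one TSV row per element of the results array, with no intermediate file and in a single job.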
0

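Sarath,

The problem is that your @trail2 output is a JSON array, and as far as I know JsonFunctions cannot parse [{...},{...}]. So I wrote it out to a file and read it back in with the extractor, which can parse arrays.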

REFERENCE ASSEMBLY [Newtonsoft.Json]; 
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats]; 

DECLARE @in string="adl://xyz.azuredatalakestore.net/todelete.json"; 
DECLARE @out string="adl://xyz.azuredatalakestore.net/todelete.tsv"; 
DECLARE @mid string="adl://xyz.azuredatalakestore.net/intermediate.txt"; 


@trail2=EXTRACT results string FROM @in USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor(); 

OUTPUT @trail2 TO @mid USING Outputters.Text(quoting:false); 

@jsonify=EXTRACT name string, 
       id string, 
       deviceName string , 
       deviceModel string, 
       clientip string 
FROM @mid USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor(); 

@logSchema=SELECT name AS sysName, 
       id AS sysId, 
       deviceName AS domainDeviceName, 
       deviceModel AS domainDeviceModel, 
       clientip AS domainClientIp 
     FROM @jsonify; 

OUTPUT @logSchema TO @out USING Outputters.Tsv(); 
+0

Thanks Michael, that solved the problem.

+0

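You can do this more efficiently without the intermediate file (which actually requires you to submit two jobs, because a script cannot read data that it creates). See my alternative answer.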