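Need help inferring the Avro schema of a JSON file in NiFi

I am trying to create a flow in NiFi that takes a valid JSON file and puts it directly into a Hive table using the PutHiveStreaming processor. My JSON looks like the following: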
{
  "Raw_Json": {
    "SystemInfo": {
      "Id": "a string ID",
      "TM": null,
      "CountID": "a string ID",
      "Topic": null,
      "AccountID": "some number",
      "StationID": "some number",
      "STime": "some Timestamp",
      "ETime": "some Timestamp"
    },
    "Profile": {
      "ID": "ID number",
      "ProductID": "Some Number",
      "City": "City Name",
      "State": "State Name",
      "Number": "XXX-XXX-XXXX",
      "ExtNumber": null,
      "Unit": null,
      "Name": "Person Name",
      "Service": "Purchase",
      "AddrID": "00000000",
      "Products": {
        "Product": [
          {
            "Code": "CODE",
            "Description": "some description"
          },
          {
            "Code": "CODE",
            "Description": "some description"
          },
          {
            "Code": "CODE",
            "Description": "some description"
          },
          {
            "Code": "CODE",
            "Description": "some description"
          },
          {
            "Code": "CODE",
            "Description": "some description"
          },
          {
            "Code": "CODE",
            "Description": "some description"
          },
          {
            "Code": "CODE",
            "Description": "some description"
          },
          {
            "Code": "CODE",
            "Description": "some description"
          },
          {
            "Code": "CODE",
            "Description": "some description"
          },
          {
            "Code": "CODE",
            "Description": "some description"
          }
        ]
      }
    },
    "Total": {
      "Amount": "some amount",
      "Delivery": "some address",
      "Estimate": "some amount",
      "Tax": null,
      "Delivery_Type": null
    }
  },
  "partition_date": "2017-05-19"
}
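I fetch the JSON, run it through the InferAvroSchema processor, use the inferred Avro schema to convert the JSON to Avro format, and send the result to the PutHiveStreaming processor. My flow looks like this: [screenshot of the flow]
The main goal is to have the entire "Raw_Json" object dumped into a single column of the Hive table. The table is partitioned by "partition_date", which is the second column of the table. The problem is that, for some reason, NiFi has trouble inferring the nested JSON inside "Raw_Json", so the column ends up null in the table, as shown here: [screenshot of the table with null values]
Does anyone know how I can make NiFi read the whole nested JSON under "Raw_Json" as one column and send it to the Hive table? How can I create my own Avro schema to do this? Any insights or ideas on how to solve this would be greatly appreciated!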
Thanks a lot. I was able to convert it to a string column by using the EvaluateJsonPath and AttributesToJson processors, and now it works fine.
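
For anyone landing here later, below is a rough sketch in plain Python (not NiFi code) of what that fix amounts to: the nested "Raw_Json" object is serialized into one JSON string, so the record has only two flat fields. The schema name `RawJsonRecord` and the helper `flatten_record` are made up for illustration, and the hand-written Avro schema is an assumption about what a two-column table would need, not something taken from the original flow.

```python
import json

# Hypothetical hand-written Avro schema (an assumption, not from the original flow):
# "Raw_Json" is a nullable string holding the serialized nested object,
# "partition_date" is a plain string.
AVRO_SCHEMA = {
    "type": "record",
    "name": "RawJsonRecord",  # illustrative name only
    "fields": [
        {"name": "Raw_Json", "type": ["null", "string"], "default": None},
        {"name": "partition_date", "type": "string"},
    ],
}


def flatten_record(doc: dict) -> dict:
    """Turn the nested "Raw_Json" object into a single JSON string.

    This mimics the idea of pulling "Raw_Json" out as a string and
    rebuilding a flat two-field record from it.
    """
    raw = doc.get("Raw_Json")
    return {
        # Compact separators keep the string column free of extra whitespace.
        "Raw_Json": None if raw is None else json.dumps(raw, separators=(",", ":")),
        "partition_date": doc["partition_date"],
    }


if __name__ == "__main__":
    sample = {
        "Raw_Json": {"SystemInfo": {"Id": "a string ID", "TM": None}},
        "partition_date": "2017-05-19",
    }
    print(json.dumps(AVRO_SCHEMA, indent=2))
    print(json.dumps(flatten_record(sample), indent=2))
```

Because "Raw_Json" is now an ordinary string, schema inference no longer has to deal with nested records or null-only fields, which seems to be where the original flow went wrong.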