2013-08-28 72 views
3

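I'm currently using Pentaho Kettle for some ETL jobs and need to integrate JSON feeds, which means I need to use JSONPath to pull out the data. For the most part it works well, except that some of the JSON data nests objects that use the same field name in both the parent and the child objects. The JSONPath query then returns the same-named keys from the nested objects when I only want the one on the parent object.

Example JSON: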

[
    {
        "Key": "5e59d536-2e3c-487c-bff1-efd0a706532f",
        "Product": {
            "Name": "Some Product",
            "LastUpdated": "2013-08-23T12:10:25.454"
        },
        "Reviewer": {
            "Email": "[email protected]",
            "LastUpdated": "2013-08-23T12:10:25.454"
        },
        "LastUpdated": "2013-08-23T12:10:25.407"
    },
    {
        "Key": "f3ae6a4b-1a20-4a9a-9a8e-2de5949c4493",
        "Product": {
            "Name": "Some Product",
            "LastUpdated": "2013-08-23T12:10:51.896"
        },
        "Reviewer": {
            "Email": "[email protected]",
            "LastUpdated": "2013-08-23T12:10:51.896"
        },
        "LastUpdated": "2013-08-23T12:10:51.896"
    },
    {
        "Key": "de01c358-6c74-473c-8cd4-a44cf50132df",
        "Product": {
            "Name": "Some Product",
            "LastUpdated": "2013-08-26T10:30:13.617"
        },
        "Reviewer": {
            "Email": "[email protected]",
            "LastUpdated": "2013-08-26T10:30:13.617"
        },
        "LastUpdated": "2013-08-26T10:30:13.601"
    },
    {
        "Key": "af04e48a-3ce8-4227-a00a-14483ca75058",
        "Product": {
            "Name": "Some Product",
            "LastUpdated": "2013-08-26T10:31:20.573"
        },
        "Reviewer": {
            "Email": "[email protected]",
            "LastUpdated": "2013-08-26T10:31:20.573"
        },
        "LastUpdated": "2013-08-26T10:31:20.573"
    },
    {
        "Key": "d1a787bb-37d2-4ea9-84fd-5a3d454b9127",
        "Product": {
            "Name": "Some Product",
            "LastUpdated": "2013-08-27T11:59:56.777"
        },
        "Reviewer": {
            "Email": "[email protected]",
            "LastUpdated": "2013-08-27T11:59:56.777"
        },
        "LastUpdated": "2013-08-27T11:59:56.73"
    },
    {
        "Key": "d8646319-af27-464f-bd50-d61e035800c6",
        "Product": {
            "Name": "Some Product",
            "LastUpdated": "2013-08-27T19:43:06.928"
        },
        "Reviewer": {
            "Email": "[email protected]",
            "LastUpdated": "2013-08-27T19:43:06.928"
        },
        "LastUpdated": "2013-08-27T19:43:06.866"
    }
]

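As you can see, the parent object and its child objects "Product" and "Reviewer" all have a "LastUpdated" field. I'm trying to get only the parent object's "LastUpdated", but using: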

$..LastUpdated 

returns, in this order, the parent LastUpdated, the Product LastUpdated, and then the Reviewer LastUpdated.

Result:

[ 
    "2013-08-23T12:10:25.407", 
    "2013-08-23T12:10:25.454", 
    "2013-08-23T12:10:25.454", 
    "2013-08-23T12:10:51.896", 
    "2013-08-23T12:10:51.896", 
    "2013-08-23T12:10:51.896", 
    "2013-08-26T10:30:13.601", 
    "2013-08-26T10:30:13.617", 
    "2013-08-26T10:30:13.617", 
    "2013-08-26T10:31:20.573", 
    "2013-08-26T10:31:20.573", 
    "2013-08-26T10:31:20.573", 
    "2013-08-27T11:59:56.73", 
    "2013-08-27T11:59:56.777", 
    "2013-08-27T11:59:56.777", 
    "2013-08-27T19:43:06.866", 
    "2013-08-27T19:43:06.928", 
    "2013-08-27T19:43:06.928" 
] 
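
To see why the recursive-descent operator `..` picks up the nested fields, here is a minimal sketch in plain Python of what `$..LastUpdated` does conceptually. It is not how Kettle or any particular JSONPath library implements it; the helper name `recursive_descent` and the trimmed one-record sample are assumptions made for illustration, and the order of matches can differ between implementations.

```python
import json

def recursive_descent(node, key):
    """Collect every value stored under `key` at any depth, roughly like `$..key`."""
    found = []
    if isinstance(node, dict):
        # Match on the current object first, then walk into its children.
        if key in node:
            found.append(node[key])
        for child in node.values():
            found.extend(recursive_descent(child, key))
    elif isinstance(node, list):
        for child in node:
            found.extend(recursive_descent(child, key))
    return found

# Trimmed version of the first record from the question (hypothetical sample).
doc = json.loads("""
[
    {
        "Key": "5e59d536-2e3c-487c-bff1-efd0a706532f",
        "Product": {"Name": "Some Product", "LastUpdated": "2013-08-23T12:10:25.454"},
        "Reviewer": {"LastUpdated": "2013-08-23T12:10:25.454"},
        "LastUpdated": "2013-08-23T12:10:25.407"
    }
]
""")

all_matches = recursive_descent(doc, "LastUpdated")
print(all_matches)
```

Because the walk visits every level, one record yields three matches: the parent's value plus one each from `Product` and `Reviewer`.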

Expected result:

[ 
    "2013-08-23T12:10:25.407", 
    "2013-08-23T12:10:51.896", 
    "2013-08-26T10:30:13.601", 
    "2013-08-26T10:31:20.573", 
    "2013-08-27T11:59:56.73", 
    "2013-08-27T19:43:06.866" 
] 

Is there a query I can use to get only the parent object's LastUpdated field?

Answers

1

Finally figured it out:

$[*].LastUpdated -> only the parents 
$[*].Product.LastUpdated -> only the product 
$[*].Reviewer.LastUpdated -> only the reviewer 
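
For anyone checking these paths outside Kettle, the three queries correspond to direct (non-recursive) lookups on each array element. The following is a rough plain-Python equivalent using only the standard library, as a sketch rather than a JSONPath implementation; the two-record sample is trimmed from the question's data and the variable names are illustrative.

```python
import json

# Two records trimmed from the question's data (hypothetical sample).
doc = json.loads("""
[
    {
        "Product": {"LastUpdated": "2013-08-23T12:10:25.454"},
        "Reviewer": {"LastUpdated": "2013-08-23T12:10:25.454"},
        "LastUpdated": "2013-08-23T12:10:25.407"
    },
    {
        "Product": {"LastUpdated": "2013-08-27T19:43:06.928"},
        "Reviewer": {"LastUpdated": "2013-08-27T19:43:06.928"},
        "LastUpdated": "2013-08-27T19:43:06.866"
    }
]
""")

# $[*].LastUpdated: one step down from each array element, no recursion.
parents = [item["LastUpdated"] for item in doc]

# $[*].Product.LastUpdated and $[*].Reviewer.LastUpdated: explicit child step.
products = [item["Product"]["LastUpdated"] for item in doc]
reviewers = [item["Reviewer"]["LastUpdated"] for item in doc]

print(parents)
print(products)
print(reviewers)
```

Each path names every step explicitly, so only one value per record is selected, unlike `$..LastUpdated`.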

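How did you manage to get this working? All I get is "We can not find any data with path [". The only thing that works is $..fieldName, but that gives me the same trouble you had: it picks up data from the nested elements. – lukfi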


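@lukfi: Honestly, I don't remember. Last year I spent a few months rewriting my entire ETL in Python 3 and ditched Kettle completely. Sorry I can't be of more help. What I do remember helping at the time was finding a decent JSONPath tester. –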


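Thanks for the reply, Chris. I didn't know JSONPath testers existed. Now I can see that your answer is correct, but Kettle 5.1 is broken. Do you remember which version of Kettle you were using? – lukfi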