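I'm currently using Pentaho Kettle for some ETL jobs and need to integrate a JSON feed, which means I need to use JSONPath to extract the data. For the most part it works well, except that some of the JSON data contains nested objects that use the same field name in both the parent and the child objects. When I only want the parent object's value, the JSONPath query also returns the identically named keys from the nested objects.
Example JSON: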
[
    {
        "Key": "5e59d536-2e3c-487c-bff1-efd0a706532f",
        "Product": {
            "Name": "Some Product",
            "LastUpdated": "2013-08-23T12:10:25.454"
        },
        "Reviewer": {
            "Email": "[email protected]",
            "LastUpdated": "2013-08-23T12:10:25.454"
        },
        "LastUpdated": "2013-08-23T12:10:25.407"
    },
    {
        "Key": "f3ae6a4b-1a20-4a9a-9a8e-2de5949c4493",
        "Product": {
            "Name": "Some Product",
            "LastUpdated": "2013-08-23T12:10:51.896"
        },
        "Reviewer": {
            "Email": "[email protected]",
            "LastUpdated": "2013-08-23T12:10:51.896"
        },
        "LastUpdated": "2013-08-23T12:10:51.896"
    },
    {
        "Key": "de01c358-6c74-473c-8cd4-a44cf50132df",
        "Product": {
            "Name": "Some Product",
            "LastUpdated": "2013-08-26T10:30:13.617"
        },
        "Reviewer": {
            "Email": "[email protected]",
            "LastUpdated": "2013-08-26T10:30:13.617"
        },
        "LastUpdated": "2013-08-26T10:30:13.601"
    },
    {
        "Key": "af04e48a-3ce8-4227-a00a-14483ca75058",
        "Product": {
            "Name": "Some Product",
            "LastUpdated": "2013-08-26T10:31:20.573"
        },
        "Reviewer": {
            "Email": "[email protected]",
            "LastUpdated": "2013-08-26T10:31:20.573"
        },
        "LastUpdated": "2013-08-26T10:31:20.573"
    },
    {
        "Key": "d1a787bb-37d2-4ea9-84fd-5a3d454b9127",
        "Product": {
            "Name": "Some Product",
            "LastUpdated": "2013-08-27T11:59:56.777"
        },
        "Reviewer": {
            "Email": "[email protected]",
            "LastUpdated": "2013-08-27T11:59:56.777"
        },
        "LastUpdated": "2013-08-27T11:59:56.73"
    },
    {
        "Key": "d8646319-af27-464f-bd50-d61e035800c6",
        "Product": {
            "Name": "Some Product",
            "LastUpdated": "2013-08-27T19:43:06.928"
        },
        "Reviewer": {
            "Email": "[email protected]",
            "LastUpdated": "2013-08-27T19:43:06.928"
        },
        "LastUpdated": "2013-08-27T19:43:06.866"
    }
]
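As you can see, the parent object and its child objects "Product" and "Reviewer" all have a "LastUpdated" field. I'm trying to get only the parent object's "LastUpdated", but using: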
$..LastUpdated
returns, for each record, the parent's LastUpdated, then the Product's LastUpdated, then the Reviewer's LastUpdated.
Result:
[
"2013-08-23T12:10:25.407",
"2013-08-23T12:10:25.454",
"2013-08-23T12:10:25.454",
"2013-08-23T12:10:51.896",
"2013-08-23T12:10:51.896",
"2013-08-23T12:10:51.896",
"2013-08-26T10:30:13.601",
"2013-08-26T10:30:13.617",
"2013-08-26T10:30:13.617",
"2013-08-26T10:31:20.573",
"2013-08-26T10:31:20.573",
"2013-08-26T10:31:20.573",
"2013-08-27T11:59:56.73",
"2013-08-27T11:59:56.777",
"2013-08-27T11:59:56.777",
"2013-08-27T19:43:06.866",
"2013-08-27T19:43:06.928",
"2013-08-27T19:43:06.928"
]
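To illustrate why the query over-matches: the `..` operator is recursive descent, so it visits every level of the document and collects any key with the requested name. The snippet below is only a rough sketch of that behaviour in plain Python, not Kettle's actual implementation. The function name `recursive_descent` and the trimmed one-record `SAMPLE` are made up for this illustration.

```python
import json

# Trimmed, illustrative sample: the first record of the feed, without Email.
SAMPLE = json.loads("""
[
  {
    "Key": "5e59d536-2e3c-487c-bff1-efd0a706532f",
    "Product": {"Name": "Some Product", "LastUpdated": "2013-08-23T12:10:25.454"},
    "Reviewer": {"LastUpdated": "2013-08-23T12:10:25.454"},
    "LastUpdated": "2013-08-23T12:10:25.407"
  }
]
""")


def recursive_descent(node, key):
    """Collect every value stored under `key` at any depth, roughly like `$..key`."""
    found = []
    if isinstance(node, dict):
        # Take the match at this level first, then look inside the children.
        if key in node:
            found.append(node[key])
        for value in node.values():
            found.extend(recursive_descent(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(recursive_descent(item, key))
    return found


matches = recursive_descent(SAMPLE, "LastUpdated")
print(matches)  # one parent value followed by the two nested values
```

Because the walk never stops at the first level, the nested Product and Reviewer timestamps always come along with the parent's.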
Expected result:
[
"2013-08-23T12:10:25.407",
"2013-08-23T12:10:51.896",
"2013-08-26T10:30:13.601",
"2013-08-26T10:31:20.573",
"2013-08-27T11:59:56.73",
"2013-08-27T19:43:06.866"
]
Is there a query I can use to get only the parent object's LastUpdated field?
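For reference, this is the selection I'm after, sketched in plain Python outside of Kettle: read the key from each top-level array element only, with no descent into children. In standard JSONPath syntax that should correspond to something like `$[*].LastUpdated`, though whether Kettle's JSON input step accepts that expression is an assumption I have not verified. The helper name `top_level_values` and the trimmed two-record `feed` are hypothetical, for illustration only.

```python
import json

# Trimmed, illustrative sample: the first two records, nested objects shortened.
feed = json.loads("""
[
  {
    "Key": "5e59d536-2e3c-487c-bff1-efd0a706532f",
    "Product": {"LastUpdated": "2013-08-23T12:10:25.454"},
    "Reviewer": {"LastUpdated": "2013-08-23T12:10:25.454"},
    "LastUpdated": "2013-08-23T12:10:25.407"
  },
  {
    "Key": "f3ae6a4b-1a20-4a9a-9a8e-2de5949c4493",
    "Product": {"LastUpdated": "2013-08-23T12:10:51.896"},
    "Reviewer": {"LastUpdated": "2013-08-23T12:10:51.896"},
    "LastUpdated": "2013-08-23T12:10:51.896"
  }
]
""")


def top_level_values(records, key):
    """Return `key` from each top-level record, ignoring nested objects."""
    return [record[key] for record in records if key in record]


parent_timestamps = top_level_values(feed, "LastUpdated")
print(parent_timestamps)
```

This yields one value per record, which is the shape of the expected result above.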
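How did you manage to get this working? All I get is "We can not find any data with path [". The only thing that works is $..fieldName, but that gives me the same trouble you had: it picks up data from the nested elements. – lukfi
@lukfi: Honestly, I don't remember. Last year I spent a few months rewriting my entire ETL in Python 3 and ditched Kettle altogether. Sorry I can't be of more help. What helped me understand it at the time was finding a decent JSONPath tester. –
Thanks for the reply, Chris. I didn't know JSONPath testers existed. Now I can see that your answer is correct, but Kettle 5.1 is broken. Do you happen to remember which version of Kettle you were using? – lukfi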