Extracting individual JSON objects

I have the JSON file below, which I get from an API.

{"Key-1":"Value-1", 
"Key-2":[{"Value-2"::Child_Value-1","Value-3":"Child_Value-2"}] 
} 
{"Key-21":"Value-21", 
"Key-22":[{"Value-22":"Child_Value-21","Value-23":"Child_Value-22"}] 
} 
{"Key-31":"Value-31", 
"Key-32":[{"Value-32":"Child_Value-31","Value-33":"Child_Value-32"}] 
}

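I understand that, taken as a whole, this is not valid JSON, but what I want to do is extract each individual object and store it in a separate file.

For example, file1.json should contain -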

[{"Key-1":"Value-1", 
    "Key-2":[{"Value-2":"Child_Value-1","Value-3":"Child_Value-2"}] 
    }]

and file2.json should contain -

[{"Key-21":"Value-21", 
    "Key-22":[{"Value-22":"Child_Value-21","Value-23":"Child_Value-22"}] 
    }]

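I have tried to do this with Python and shell scripts, but it is not leading me anywhere. Is there a good library in Python or shell that can help? I am restricted in which languages I can use (Python, shell script).

Source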

2016-07-05 FirstName

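As far as I know, there are no libraries for parsing broken JSON (missing quotes, no single root array/object, ...). – jonrsharpe

If the JSON were well-formed, you would find Python's [json module](https://docs.python.org/2/library/json.html) very useful. Also, I'd forget about shell scripting... – kazbeel

You need some way to find the boundaries between the individual bits of JSON. Is each one always 3 lines? That would be ideal. – RemcoGerlich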

This does what your question asks for (though I suspect it isn't actually what you want)

filecount = 0
newfilecontents = ''

with open('junksrc.txt', mode='r', encoding='utf-8') as src:
    for line in src:
        # A new object starts on a line containing '{"Key'
        if '{"Key' in line:
            newfilecontents = '[' + line
        # The nested array closes on a line containing '}]'. In this input the
        # object's own '}' is on the next line, so the closing '}]' is appended here.
        if '}]' in line:
            newfilecontents = newfilecontents + ' ' + line + ' }]\n'
            filecount += 1
            filename = 'junkdest' + str(filecount) + '.json'
            with open(filename, mode='w', encoding='utf-8') as dest:
                dest.write(newfilecontents)

Source

2016-07-05 11:35:23 jwpfox

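This is a very slow way of doing things, and it isn't robust against errors in the data, but it might work. It's a generator that finds the first '{', then the next '}', and tries to parse the bit between them as JSON. If that fails, it looks for the next '}' and tries again. It yields the bits that parsed successfully.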

import json

def generate_json_dictionaries(s):
    opening = s.find('{')
    while opening != -1:
        possible_closing = opening
        while True:
            # Try each later '}' in turn until the slice parses as JSON
            possible_closing = s.find('}', possible_closing + 1)
            if possible_closing == -1:
                return  # Data incomplete
            try:
                j = json.loads(s[opening:possible_closing + 1])
                yield j
                break
            except ValueError:
                pass
        opening = s.find('{', possible_closing + 1)  # Next start

Untested.

Source
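The generator above re-parses the same text once per candidate '}'. As an alternative sketch (a different technique, not the one described above), the standard library's `json.JSONDecoder.raw_decode` parses one value starting at a given index and reports where it stopped, so a string of concatenated objects can be walked in a single pass. The function name `iter_json_objects` and the sample string are made up for illustration, and the sketch assumes every object in the input is valid JSON; a malformed one raises `ValueError`.

```python
import json

def iter_json_objects(text):
    """Yield each top-level JSON value found in a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it by hand
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            return
        # Returns the parsed value and the index just past its end
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Hypothetical sample: two objects back to back, not wrapped in an array
sample = '{"Key-1": "Value-1"} \n{"Key-21": "Value-21", "Key-22": [{"Value-22": "Child_Value-21"}]}\n'
objects = list(iter_json_objects(sample))
print(len(objects))
```

Each yielded object can then be written to its own file with `json.dump`.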

2016-07-05 11:41:10 RemcoGerlich

If you have jq, you can preprocess the data into a form that is easily parsed by the standard library's JSON parser:

$ jq -s '.' tmp.json
[
  {
    "Key-1": "Value-1",
    "Key-2": [
      {
        "Value-2": "Child_Value-1",
        "Value-3": "Child_Value-2"
      }
    ]
  },
  {
    "Key-21": "Value-21",
    "Key-22": [
      {
        "Value-22": "Child_Value-21",
        "Value-23": "Child_Value-22"
      }
    ]
  },
  {
    "Key-31": "Value-31",
    "Key-32": [
      {
        "Value-32": "Child_Value-31",
        "Value-33": "Child_Value-32"
      }
    ]
  }
]

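jq can recognize a stream of valid top-level objects, which is what you have here. The -s option tells jq to put them all into a single top-level array before processing further.

Source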
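Once the data is a single array, splitting it into the files the question asks for is a short loop. The following is a minimal sketch, assuming the array has already been parsed; the helper name `write_objects_to_files`, the `file` prefix, and the temporary output directory are illustrative choices. The inline string stands in for jq's output, which in a pipeline would be read with `json.load(sys.stdin)` instead.

```python
import json
import os
import tempfile

def write_objects_to_files(objects, directory, prefix="file"):
    """Write each object to <prefix>N.json as a one-element JSON array."""
    paths = []
    for number, obj in enumerate(objects, start=1):
        path = os.path.join(directory, f"{prefix}{number}.json")
        with open(path, mode="w", encoding="utf-8") as dest:
            # Wrap the object in a list, matching the layout shown in the question
            json.dump([obj], dest, indent=4)
        paths.append(path)
    return paths

# Stand-in for the array printed by `jq -s '.' tmp.json`
array = json.loads('[{"Key-1": "Value-1"}, {"Key-21": "Value-21"}]')
out_dir = tempfile.mkdtemp()
written = write_objects_to_files(array, out_dir)
print([os.path.basename(p) for p in written])
```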

2016-07-05 13:46:10 chepner

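This was helpful. Thanks! Is there a way to give a name to the array you create with the jq command? Does jq have some additional feature for this kind of operation? – FirstName

I'm not sure what you mean by "giving it a name". One way to use it is to pipe it into your Python script and read it from standard input with `json.load`: `jq -s . tmp.json | python -c 'import sys, json; x = json.load(sys.stdin); ...'` – chepner

By naming the array, I mean the array you create with the jq -s command, the "single top-level array". The example you gave creates a top-level array without a name. I'd like to know whether there is an option to create it with a name. – FirstName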
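One possible reading of "an array with a name" is a JSON object with a single key whose value is the array. A minimal sketch of that reading in Python, assuming the key name `objects` (a hypothetical name chosen for illustration) and a small stand-in array:

```python
import json

def name_array(array, name="objects"):
    """Wrap a list in a dict so it serializes as {"<name>": [...]}."""
    return {name: array}

# Stand-in for the array printed by `jq -s '.' tmp.json`
array = json.loads('[{"Key-1": "Value-1"}, {"Key-21": "Value-21"}]')
named = name_array(array)
print(json.dumps(named, indent=2))
```

If that is the intended shape, jq's object-construction syntax should give the same result directly, for example `jq -s '{objects: .}' tmp.json`.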
