2011-12-05 60 views
11

I have this code that parses JSON, and I want to search through it:

import json 
from pprint import pprint 
json_data=open('bookmarks.json') 
jdata = json.load(json_data) 
pprint (jdata) 
json_data.close() 

How can I search through it for u'uri': u'http:

Answers

15

Since json.loads simply returns a dict, you can use the operators that apply to dicts:

>>> jdata = json.loads('{"uri": "http:", "foo": "bar"}')
>>> 'uri' in jdata  # Check if 'uri' is in jdata's keys 
True 
>>> jdata['uri']   # Will return the value belonging to the key 'uri' 
u'http:' 
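
As a side note, json.loads parses a string, while json.load reads from a file-like object. The following is a minimal Python 3 sketch of both, using io.StringIO as a stand-in for an open file; the sample document is made up for illustration.

```python
import io
import json

text = '{"uri": "http:", "foo": "bar"}'  # made-up sample document

# json.loads parses JSON held in a string.
from_string = json.loads(text)

# json.load reads JSON from a file-like object; StringIO stands in for open(...).
from_file = json.load(io.StringIO(text))

print('uri' in from_string)         # membership test on the dict's keys
print(from_string.get('missing'))   # .get returns None instead of raising KeyError
print(from_string == from_file)     # both calls produce the same dict
```

With a real file, something like `with open('bookmarks.json') as f: jdata = json.load(f)` also closes the file for you.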

Edit: to give an idea of how to loop through the data, consider the following example:

>>> import json 
>>> jdata = json.loads(open ('bookmarks.json').read()) 
>>> for c in jdata['children'][0]['children']: 
...     print 'Title: {}, URI: {}'.format(c.get('title', 'No title'),
...                                       c.get('uri', 'No uri'))
... 
Title: Recently Bookmarked, URI: place:folder=BOOKMARKS_MENU(...) 
Title: Recent Tags, URI: place:sort=14&type=6&maxResults=10&queryType=1 
Title: , URI: No uri 
Title: Mozilla Firefox, URI: No uri 

Inspecting the jdata data structure will let you navigate it as needed. Your pprint call is already a good starting point.

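Edit 2: another attempt. This gets the file you mentioned into a list of dictionaries. With this, I think you should be able to adapt it to your needs.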

>>> import pprint
>>> def build_structure(data, d=None):
...     if d is None:  # avoid sharing one default list between calls
...         d = []
...     if 'children' in data:
...         for c in data['children']:
...             d.append({'title': c.get('title', 'No title'),
...                       'uri': c.get('uri', None)})
...             build_structure(c, d)
...     return d
...
>>> pprint.pprint(build_structure(jdata))
[{'title': u'Bookmarks Menu', 'uri': None},
 {'title': u'Recently Bookmarked',
  'uri': u'place:folder=BOOKMARKS_MENU&folder=UNFILED_BOOKMARKS&(...)'},
 {'title': u'Recent Tags',
  'uri': u'place:sort=14&type=6&maxResults=10&queryType=1'},
 {'title': u'', 'uri': None},
 {'title': u'Mozilla Firefox', 'uri': None},
 {'title': u'Help and Tutorials',
  'uri': u'http://www.mozilla.com/en-US/firefox/help/'},
 (...)
]

To then "search through it for u'uri': u'http:'", do something like this:

for c in build_structure(jdata):
    # 'uri' is None for folders, so guard before calling startswith
    if c['uri'] and c['uri'].startswith('http:'):
        print('Started with http')
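
The same search can also be done in one pass, without building the intermediate list. Below is a minimal Python 3 sketch; the helper name iter_uris and the miniature sample are hypothetical, and the 'children' / 'title' / 'uri' layout is assumed from the output shown above.

```python
def iter_uris(node, prefix='http:'):
    """Yield (title, uri) for every nested entry whose uri starts with prefix."""
    uri = node.get('uri')
    if uri and uri.startswith(prefix):
        yield node.get('title', 'No title'), uri
    # Folders carry a 'children' list; leaf bookmarks usually do not.
    for child in node.get('children', []):
        yield from iter_uris(child, prefix)


# Hypothetical miniature of a bookmarks export, for illustration only.
sample = {
    'title': '',
    'children': [
        {'title': 'Bookmarks Menu', 'children': [
            {'title': 'Recent Tags', 'uri': 'place:sort=14'},
            {'title': 'Help and Tutorials',
             'uri': 'http://www.mozilla.com/en-US/firefox/help/'},
        ]},
    ],
}

matches = list(iter_uris(sample))
print(matches)
```

Because it is a generator, you can stop at the first hit with `next(iter_uris(jdata), None)`.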
+0

It says: Traceback (most recent call last): File "", line 3, in ValueError: zero length field name in format, when I try to run the second example – BKovac

+0

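This probably has to do with the layout of your exported bookmarks... I don't know the format very well, but I'd guess it creates a 'children' key for each folder or container in your bookmarks. For example, try 'for c in jdata['children']:' instead of the above. Also, note that str.format() is new in Python 2.6, and the empty '{}' fields used here need Python 2.7 or later... you may have an older version. If so, replace that line with 'print 'Title: %s, URI: %s' % (c.get('title', 'No title'), c.get('uri', 'No uri'))'. – jro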

+0

Still not working. Here is the bookmarks file: http://pastebin.com/uCtECvDi – BKovac

0

You can use jsonpipe if you only need the output (and are more comfortable with the command line):

cat bookmarks.json | jsonpipe |grep uri 
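
If installing a separate tool is not an option, a similar "one line per leaf value, then filter" approach can be approximated in Python 3. This is only a rough sketch and not a reimplementation of jsonpipe: the flatten helper, its slash-separated path format, and the sample document are all assumptions made for illustration.

```python
import json


def flatten(node, path=''):
    """Yield (path, value) pairs for every leaf in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, path + '/' + str(key))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from flatten(value, path + '/' + str(index))
    else:
        yield path, node


# Made-up sample standing in for the contents of bookmarks.json.
sample = json.loads(
    '{"children": [{"title": "A", "uri": "http://a"}, {"title": "B"}]}'
)

# Keep only the leaves whose path ends in "uri", much like piping through grep.
uri_lines = [(p, v) for p, v in flatten(sample) if p.endswith('/uri')]
for p, v in uri_lines:
    print(p, v)
```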
+0

The jsonpipe link seems to have been changed or removed –

+0

@SureshPrajapati Fixed – number5

3

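ObjectPath is a library that provides the ability to query JSON and nested structures of dicts and lists. For example, by using $..foo you can search for all attributes named "foo", no matter how deep they are.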

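Although the documentation focuses on the command-line interface, you can run queries programmatically by using the package's Python internals. The example below assumes you have already loaded the data into Python data structures (dicts & lists). If you are starting with a JSON file or string, you just need to use load or loads from the json module first.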

import objectpath 

data = [ 
    {'foo': 1, 'bar': 'a'}, 
    {'foo': 2, 'bar': 'b'}, 
    {'NoFooHere': 2, 'bar': 'c'}, 
    {'foo': 3, 'bar': 'd'}, 
] 

tree_obj = objectpath.Tree(data) 

tuple(tree_obj.execute('$..foo')) 
# returns: (1, 2, 3) 

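Note that it simply skips elements that lack a "foo" attribute, such as the third item in the list. You can also do much more complex queries, which makes ObjectPath handy for deeply nested structures (e.g. finding where x has y that has z: $.x.y.z). For more information, see the documentation and tutorial.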
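
To make the idea behind a query like $..foo concrete, here is a standard-library-only Python 3 sketch of a recursive search for a key at any depth. It is not how ObjectPath is implemented; find_key is a hypothetical helper, and the JSON string simply mirrors the sample data above.

```python
import json


def find_key(node, key):
    """Yield every value stored under `key`, at any depth of dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Keep descending: the value itself may contain more matches.
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


# Start from a JSON string, as you would after reading a file.
data = json.loads(
    '[{"foo": 1, "bar": "a"}, {"foo": 2, "bar": "b"},'
    ' {"NoFooHere": 2, "bar": "c"}, {"foo": 3, "bar": "d"}]'
)

result = tuple(find_key(data, 'foo'))
print(result)  # (1, 2, 3)
```

As with the ObjectPath query, the item without a "foo" key is simply skipped.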

1

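There seems to be a typo in the JSON dict in jro's answer as originally posted (a missing colon).

The correct syntax is: jdata = json.loads('{"uri": "http:", "foo": "bar"}')

That cleared it up for me when playing with the code.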
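
For anyone hitting the same problem, this small Python 3 sketch shows how the missing colon surfaces: json.loads raises json.JSONDecodeError (a subclass of ValueError) for the malformed string and parses the corrected one. The variable names are only for illustration.

```python
import json

bad = '{"uri": "http:", "foo", "bar"}'    # missing colon after "foo"
good = '{"uri": "http:", "foo": "bar"}'   # corrected version

try:
    json.loads(bad)
    bad_parsed = True
except json.JSONDecodeError as exc:
    # The exception message points at the offending position in the string.
    bad_parsed = False
    print('Invalid JSON:', exc)

jdata = json.loads(good)
print(jdata['foo'])
```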

0

Functions to search and print dict-like data such as JSON. *Written for Python 3

Search:

def pretty_search(dict_or_list, key_to_search, search_for_first_only=False):
    """
    Give it a dict or a list of dicts and a dict key (to get values of),
    it will search through it and all contained dicts and lists
    for all values of the dict key you gave, and will return a set of them
    unless you specify search_for_first_only=True, in which case it
    returns the first match. When collecting all matches, the matched
    values must be hashable because they are stored in a set.

    :param dict_or_list: nested dicts/lists/sets to search
    :param key_to_search: dict key whose values are collected
    :param search_for_first_only: return the first match instead of a set
    :return: set of matches, the first match, or None if nothing was found
    """
    search_result = set()
    if isinstance(dict_or_list, dict):
        for key in dict_or_list:
            key_value = dict_or_list[key]
            if key == key_to_search:
                if search_for_first_only:
                    return key_value
                else:
                    search_result.add(key_value)
            if isinstance(key_value, (dict, list, set)):
                _search_result = pretty_search(key_value, key_to_search, search_for_first_only)
                if _search_result and search_for_first_only:
                    return _search_result
                elif _search_result:
                    for result in _search_result:
                        search_result.add(result)
    elif isinstance(dict_or_list, (list, set)):
        for element in dict_or_list:
            if isinstance(element, (list, set, dict)):
                # Propagate the first-only flag to the nested search
                _search_result = pretty_search(element, key_to_search, search_for_first_only)
                if _search_result and search_for_first_only:
                    return _search_result
                elif _search_result:
                    for result in _search_result:
                        search_result.add(result)
    return search_result if search_result else None

Print:

def pretty_print(dict_or_list, print_spaces=0):
    """
    Give it a dict or a list of dicts and it will return
    a tab-indented, printable text version of it.
    The text is also printed when called at the top level (print_spaces=0).

    :param dict_or_list: nested dicts/lists/sets to render
    :param print_spaces: current indentation depth, in tabs
    :return: the rendered text
    """
    pretty_text = ""
    if isinstance(dict_or_list, dict):
        for key in dict_or_list:
            key_value = dict_or_list[key]
            if isinstance(key_value, dict):
                key_value = pretty_print(key_value, print_spaces + 1)
                pretty_text += "\t" * print_spaces + "{}:\n{}\n".format(key, key_value)
            elif isinstance(key_value, (list, set)):
                pretty_text += "\t" * print_spaces + "{}:\n".format(key)
                for element in key_value:
                    if isinstance(element, (dict, list, set)):
                        pretty_text += pretty_print(element, print_spaces + 1)
                    else:
                        pretty_text += "\t" * (print_spaces + 1) + "{}\n".format(element)
            else:
                pretty_text += "\t" * print_spaces + "{}: {}\n".format(key, key_value)
    elif isinstance(dict_or_list, (list, set)):
        for element in dict_or_list:
            if isinstance(element, (dict, list, set)):
                pretty_text += pretty_print(element, print_spaces + 1)
            else:
                pretty_text += "\t" * print_spaces + "{}\n".format(element)
    else:
        pretty_text += str(dict_or_list)
    if print_spaces == 0:
        print(pretty_text)
    return pretty_text
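
If all you need is a readable dump rather than a custom layout, the standard library may already be enough: json.dumps accepts indent and sort_keys arguments. A minimal Python 3 sketch, with made-up sample data:

```python
import json

# Made-up sample resembling a small bookmarks tree.
data = {
    'title': 'Bookmarks Menu',
    'children': [{'title': 'Help', 'uri': 'http://example.com/'}],
}

# indent controls the nesting width; sort_keys gives a stable key order.
pretty = json.dumps(data, indent=2, sort_keys=True)
print(pretty)
```

The result is still valid JSON, so it can be written to a file and loaded again with json.loads.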