How do I search a nested list of dictionaries in Python?


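In Python 3.6, I have a list like the one below and can't figure out how to search its values properly. Given a search string such as the one below, I need to search the title and tags values and return the id of the image with the most matches. If several different images (ids) have the same number of matches, the one whose title comes first alphabetically should be returned. The search should also be case-insensitive. In the code below, search holds the terms to look for. It should return the first id value, but it returns a different value instead.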

image_info = [ 
{ 
    "id" : "34694102243_3370955cf9_z", 
    "title" : "Eastern", 
    "flickr_user" : "Sean Davis", 
    "tags" : ["Los Angeles", "California", "building"] 
}, 
{ 
    "id" : "37198655640_b64940bd52_z", 
    "title" : "Spreetunnel", 
    "flickr_user" : "Jens-Olaf Walter", 
    "tags" : ["Berlin", "Germany", "tunnel", "ceiling"] 
}, 
{ 
    "id" : "34944112220_de5c2684e7_z", 
    "title" : "View from our rental", 
    "flickr_user" : "Doug Finney", 
    "tags" : ["Mexico", "ocean", "beach", "palm"] 
}, 
{ 
    "id" : "36140096743_df8ef41874_z", 
    "title" : "Someday", 
    "flickr_user" : "Thomas Hawk", 
    "tags" : ["Los Angeles", "Hollywood", "California", "Volkswagen", "Beatle", "car"] 
}

]

my_counter = 0 
search = "CAT IN BUILding" 
search = search.lower().split() 
matches = {} 

for image in image_info:
    for word in search:
        word = word.lower()
        if word in image["title"].lower().split(" "):
            my_counter += 1
            print(my_counter)
        if word in image["tags"]:
            my_counter += 1
            print(my_counter)
    if my_counter > 0:
        matches[image["id"]] = my_counter
        my_counter = 0

Source

2017-10-13 Gray

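What do you mean by "return"? You aren't returning anything. What is your expected output, and how does it differ from what you get? Can you be more specific? –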

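I ran your code and it gave me the first id in the matches dictionary. However, there is a bug with the tags. You lowercase the words in the search string but not the words in the tags, and the tags contain some capitalized words. For example, you will never match "Los Angeles". – bouma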
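A minimal sketch of the point in the comment above, using the tags of the first image from the question. The variable names `found_raw` and `found_lowered` are made up for illustration. The `in` test on a list compares strings exactly, so a lowercased search word can only match once the tags are lowercased as well.

```python
# Tags copied from the first image in the question
tags = ["Los Angeles", "California", "building"]

word = "CALIFORNIA".lower()  # the search word after lowercasing: "california"

# Membership in a list is an exact, case-sensitive string comparison
found_raw = word in tags

# Lowercasing every tag first makes the comparison case-insensitive
found_lowered = word in [tag.lower() for tag in tags]

print(found_raw, found_lowered)
```

Note that a multi-word tag such as "Los Angeles" is still compared as a whole, so the single word "los" would not match it unless the tags are also split into words.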

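@juanpa.arrivillaga So, I use the search term "CAT IN BUILding" to search the title and tags values in the list of dictionaries, and I want the function to return the match it finds. For "CAT IN BUILding" it should return 1, with the match found at id 34694102243_3370955cf9_z. If the search term were "building in Mexico beach", it should return 34944112220_de5c2684e7_z, because that image has 2 matches in its tags. – Gray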
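The behaviour described in the question and in the comment above could be sketched roughly as follows. This is one possible reading, not a definitive implementation: the function names `count_matches` and `best_match` are hypothetical, tags are compared as whole strings (as in the question's code), repeated search words are counted once, and `None` is returned when nothing matches.

```python
# Data from the question (flickr_user omitted for brevity)
image_info = [
    {"id": "34694102243_3370955cf9_z", "title": "Eastern",
     "tags": ["Los Angeles", "California", "building"]},
    {"id": "37198655640_b64940bd52_z", "title": "Spreetunnel",
     "tags": ["Berlin", "Germany", "tunnel", "ceiling"]},
    {"id": "34944112220_de5c2684e7_z", "title": "View from our rental",
     "tags": ["Mexico", "ocean", "beach", "palm"]},
    {"id": "36140096743_df8ef41874_z", "title": "Someday",
     "tags": ["Los Angeles", "Hollywood", "California", "Volkswagen", "Beatle", "car"]},
]


def count_matches(image, words):
    """Count how many search words appear among the title words or the tags."""
    title_words = image["title"].lower().split()
    tags = [tag.lower() for tag in image["tags"]]  # whole tags, lowercased
    count = 0
    for word in words:
        if word in title_words:
            count += 1
        if word in tags:
            count += 1
    return count


def best_match(images, search):
    """Return (id, count) for the best image, or None when nothing matches."""
    words = set(search.lower().split())  # assumption: duplicate words count once
    scored = [(count_matches(image, words), image) for image in images]
    scored = [(count, image) for count, image in scored if count > 0]
    if not scored:
        return None
    # Highest count wins; ties go to the alphabetically first title
    count, image = min(scored, key=lambda pair: (-pair[0], pair[1]["title"].lower()))
    return image["id"], count


print(best_match(image_info, "CAT IN BUILding"))
print(best_match(image_info, "building in Mexico beach"))
```

Sorting on the pair (negative count, lowercased title) lets a single `min` call handle both the "most matches" rule and the alphabetical tie-break.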

Here is a variation of the code in which I index the data before searching. It is a very basic version of how CloudSearch or ElasticSearch would index and search.

import itertools 
from collections import Counter 
image_info = [ 
{ 
    "id" : "34694102243_3370955cf9_z", 
    "title" : "Eastern", 
    "flickr_user" : "Sean Davis", 
    "tags" : ["Los Angeles", "California", "building"] 
}, 
{ 
    "id" : "37198655640_b64940bd52_z", 
    "title" : "Spreetunnel", 
    "flickr_user" : "Jens-Olaf Walter", 
    "tags" : ["Berlin", "Germany", "tunnel", "ceiling"] 
}, 
{ 
    "id" : "34944112220_de5c2684e7_z", 
    "title" : "View from our rental", 
    "flickr_user" : "Doug Finney", 
    "tags" : ["Mexico", "ocean", "beach", "palm"] 
}, 
{ 
    "id" : "36140096743_df8ef41874_z", 
    "title" : "Someday", 
    "flickr_user" : "Thomas Hawk", 
    "tags" : ["Los Angeles", "Hollywood", "California", "Volkswagen", "Beatle", "car"] 
} 
] 

search = "CAT IN BUILding california"
search = set(search.lower().split())

# Build a rudimentary search index: lowercase word -> list of image ids
index = {}
for info in image_info:
    bag = info["title"].lower().split()
    # Split multi-word tags so that "Los Angeles" is indexed as "los" and "angeles"
    tags = [t.lower().split() for t in info["tags"]]
    tags = list(itertools.chain.from_iterable(tags))
    for k in bag + tags:
        index.setdefault(k, []).append(info["id"])

# print(index)

# Collect the ids of every image that contains a search word
hits = []
for s in search:
    if s in index:
        hits += index[s]

# Print the id with the most hits (ties are not broken by title here)
if hits:
    print(Counter(hits).most_common(1)[0][0])
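The same indexing idea can be wrapped in functions and extended with the alphabetical tie-break the question asks for. The sketch below rests on a few assumptions: the names `build_index` and `search_index` are hypothetical, each word maps to a set of ids so that an image counts at most once per search word, and `None` is returned when there are no hits.

```python
from collections import Counter, defaultdict

# A subset of the question's data (flickr_user omitted for brevity)
image_info = [
    {"id": "34694102243_3370955cf9_z", "title": "Eastern",
     "tags": ["Los Angeles", "California", "building"]},
    {"id": "37198655640_b64940bd52_z", "title": "Spreetunnel",
     "tags": ["Berlin", "Germany", "tunnel", "ceiling"]},
    {"id": "36140096743_df8ef41874_z", "title": "Someday",
     "tags": ["Los Angeles", "Hollywood", "California", "Volkswagen", "Beatle", "car"]},
]


def build_index(images):
    """Map each lowercase word of a title or tag to the set of ids containing it."""
    index = defaultdict(set)
    for image in images:
        words = image["title"].lower().split()
        for tag in image["tags"]:
            words.extend(tag.lower().split())  # "Los Angeles" -> "los", "angeles"
        for word in words:
            index[word].add(image["id"])
    return index


def search_index(index, titles, search):
    """Return the id matching the most search words, or None if there are no hits."""
    hits = Counter()
    for word in set(search.lower().split()):
        hits.update(index.get(word, ()))  # add one hit per id containing the word
    if not hits:
        return None
    # Most hits first; ties go to the alphabetically first title
    return min(hits, key=lambda image_id: (-hits[image_id], titles[image_id].lower()))


index = build_index(image_info)
titles = {image["id"]: image["title"] for image in image_info}
print(search_index(index, titles, "CAT IN BUILding california"))
```

Building the index once pays off when many searches run against the same data, since each search then only looks up its own words instead of scanning every image.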

Source

2017-10-13 04:34:35 djinn

If I try to run the code you provided, I get the error: TypeError: append() takes exactly one argument (3 given). – Gray

Thanks @Mahi. I have changed the code to fix the problem. – djinn

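Thanks, this works. But I have one question. Right now it outputs every image id along with its number of hits. How can I print only the image id with the highest number of hits instead of all the image ids that were hit? – Gray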

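You are creating a new entry in the matches dictionary with matches[image["id"]] = my_counter. If you want to keep only one entry in that dictionary, holding both the image id and the count, you can use the version below, where I have modified your dictionary and the condition. Hope it helps.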

search_term = "CAT IN BUILding"
search = search_term.lower().split()
matches = {}
matches[search_term] = {}

for image in image_info:
    my_counter = 0
    title_words = image["title"].lower().split()
    tags = [tag.lower() for tag in image["tags"]]  # lowercase the tags so matching is case-insensitive
    for word in search:
        if word in title_words:
            my_counter += 1
        if word in tags:
            my_counter += 1
    if my_counter > 0:
        # dict.values() is a view in Python 3, so copy it into a list before indexing
        best = list(matches[search_term].values())
        if not best or my_counter > best[0]:
            # Replace the previous best so that only one entry is kept
            matches[search_term] = {image["id"]: my_counter}

Source

2017-10-13 04:28:42

I tried running your modified code and now get the error: TypeError: 'dict_values' object does not support indexing – Gray

In Python 3, dict.values() returns a dict_values view rather than a list. Just put list() around matches[search_term].values(). It should look like list(matches[search_term].values())[0] –
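A small sketch of the behaviour described in the comment above. The dictionary contents and the variable names are only illustrative. The view returned by `dict.values()` cannot be indexed, but it can be copied into a list or iterated.

```python
# Illustrative dictionary holding one image id and its hit count
matches = {"34694102243_3370955cf9_z": 1}

values = matches.values()  # a dict_values view in Python 3, not a list

try:
    values[0]  # views do not support indexing
    indexing_failed = False
except TypeError:
    indexing_failed = True

first_count = list(values)[0]    # copy the view into a list, then index it
also_first = next(iter(values))  # or take the first item without copying

print(indexing_failed, first_count, also_first)
```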

Also lowercase the list of tags, as one of the users above pointed out. –
