2017-02-16
0

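Finding the closest key in a defaultdict

I use defaultdicts to store lists of values, where the keys are the periods in which the values can be observed. When looking up from a list of all periods of interest, I want to find the closest period in my defaultdict (note: not every period is stored in the defaultdict).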

However, since defaultdicts are not sorted, the approach below does not return the right values.

Is there a different way to return the closest available key of a defaultdict?

from collections import defaultdict 
import numpy as np 

def_dict = defaultdict(list) 
# entries that will be stored in the defaultdict 
reg_dict = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]} 

# store items from regular dict in defaultdict 
for k, v in reg_dict.items(): 
    def_dict[k] = v 

# Lookup periods 
periods = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8] 

for period in periods: 

    # this approach does not return the right keys as defaultdicts are not sorted 
    closest_key = np.abs(np.array(list(def_dict.keys())) - period).argmin() 

    print("period: ", period, " - looked up key: ", closest_key) 

This returns the following:

period: -1 - looked up key: 0 
period: 0 - looked up key: 0 
period: 1 - looked up key: 0 
period: 2 - looked up key: 1 
period: 3 - looked up key: 1 
period: 4 - looked up key: 2 
period: 5 - looked up key: 2 
period: 6 - looked up key: 2 
period: 7 - looked up key: 2 
period: 8 - looked up key: 2 
+2

1) You don't really need a 'defaultdict'; an 'OrderedDict' would do the job, and 2) why don't you sort by key? Could you [edit] your post to show the expected output?

+0

argmin returns the key's index, so the result is correct. If you want the value, use 'min(closest_key)'.

Answers

1

As I understand it, you want output similar to this?

[0, 0, 0, 2, 2, 5, 5, 5, 5, 5] 

For the case above, the logic would be

closest_key = [min(def_dict.keys(), key = lambda x: abs(x - p)) for p in periods] 

The optional key argument accepted by Python built-ins such as min is very useful in cases like this.

1

I agree with @septra that you need the Euclidean distance, but this can be achieved with numpy as well:

from collections import defaultdict 
import numpy as np 

def_dict = defaultdict(list) 
# entries that will be stored in the defaultdict 
reg_dict = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]} 

# store items from regular dict in defaultdict 
for k, v in reg_dict.items(): 
    def_dict[k] = v 

periods = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8] 
a = list(def_dict.keys()) 
for period in periods: 
    closest_key = np.sqrt(np.power(np.add(a, -period),2)).argmin() 
    # OR closest_key = np.abs(np.add(a, -period)).argmin() 

    print("period: ", period, " - looked up key: ", a[closest_key]) 
2

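With an OrderedDict and sorted keys, you can use a binary search. For a large number of keys, the lookup will be much faster than your current approach.

Since you want the closest key, you need to find the rightmost key below x and the leftmost key above x. Once you have found the index i of the rightmost key below x, the other candidate (the leftmost key above x) will be at index i+1.

You need to make sure these indices are still inside your array.

Finally, you only need to compute the distance from each of these two values to x.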

See bisect and np.searchsorted.
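The steps above can be sketched with the standard-library bisect module. This is only a minimal illustration, not the answerer's own code: the helper name closest_key is made up here, and breaking ties towards the smaller key is an assumption chosen to match the expected output shown earlier in the thread.

```python
from bisect import bisect_left


def closest_key(sorted_keys, x):
    """Return the key in sorted_keys nearest to x (hypothetical helper).

    Ties are resolved in favour of the smaller key (an assumption).
    """
    if not sorted_keys:
        raise ValueError("no keys to search")
    i = bisect_left(sorted_keys, x)  # index of the first key >= x
    if i == 0:                       # x is below every key
        return sorted_keys[0]
    if i == len(sorted_keys):        # x is above every key
        return sorted_keys[-1]
    before, after = sorted_keys[i - 1], sorted_keys[i]
    # Compare the distances of the two neighbouring candidates to x.
    return before if x - before <= after - x else after


reg_dict = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]}
sorted_keys = sorted(reg_dict)  # sort once, reuse for every lookup
periods = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8]

result = [closest_key(sorted_keys, p) for p in periods]
print(result)  # [0, 0, 0, 2, 2, 5, 5, 5, 5, 5]
```

Each lookup costs O(log n) after the one-off sort, whereas scanning all keys costs O(n) per lookup.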

1

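As Eric said, to do this efficiently you should use a binary search. However, if the number of keys is small, a simple linear search may be good enough. There is no need for a defaultdict or an OrderedDict; just sort the keys.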

import numpy as np 

# entries 
reg_dict = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]} 

keys = np.array(sorted(reg_dict.keys())) 
print('keys', keys) 

# Lookup periods 
periods = np.arange(-1, 9) 

for period in periods: 
    closest_key = keys[np.abs(keys - period).argmin()] 
    print("period: ", period, " - looked up key: ", closest_key) 

Output

keys [-3 0 2 5] 
period: -1 - looked up key: 0 
period: 0 - looked up key: 0 
period: 1 - looked up key: 0 
period: 2 - looked up key: 2 
period: 3 - looked up key: 2 
period: 4 - looked up key: 5 
period: 5 - looked up key: 5 
period: 6 - looked up key: 5 
period: 7 - looked up key: 5 
period: 8 - looked up key: 5
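
For a larger number of keys, the binary search mentioned in this answer could be applied to all periods at once with np.searchsorted. The following is a sketch under a few assumptions: there are at least two keys, ties go to the smaller key, and the variable names (idx, left, right, closest) are illustrative only.

```python
import numpy as np

reg_dict = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]}

keys = np.array(sorted(reg_dict))  # searchsorted requires sorted input
periods = np.arange(-1, 9)

# For each period, find the index of the first key >= period, then clip it
# so that both idx - 1 and idx are valid positions (assumes len(keys) >= 2).
idx = np.clip(np.searchsorted(keys, periods), 1, len(keys) - 1)

left, right = keys[idx - 1], keys[idx]  # the two neighbouring candidates
# Keep whichever neighbour is nearer; ties go to the smaller key.
closest = np.where(periods - left <= right - periods, left, right)

print(closest.tolist())  # [0, 0, 0, 2, 2, 5, 5, 5, 5, 5]
```

The result agrees with the linear-search output above, but each lookup is a binary search rather than a scan over every key.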