Python: How to compare more than two keys in the same dictionary?

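I am parsing 100 files that share a similar format. From each file I build a dictionary that may contain two keys or more than two keys, and the values for each key are stored in a set. In every case there will be one key whose set contains the value 'Y'. For that key, I need to remove any values that also appear under the other keys.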

I asked a similar question when I only had two keys, and it was solved: Python: How to compare values of different keys in dictionary and then delete duplicates?

The code below works fine when the dictionary has two keys, but not when it has more than two.

for d, p in zip(temp_list, temp_search_list):
    temp2[d].add(p)  # dictionary with delvt and pin names for cell
for test_d, test_p in temp2.items():
    if not re.search('Y', ' '.join(test_p)):
        tp = temp2[test_d]
    else:
        temp2[test_d] = [t for t in temp2[test_d] if t not in tp]

Here is an example dictionary with three keys, although depending on the parsed file I may have more keys.

temp2 = {'0.1995': set(['X7:GATE', 'X3:GATE', 'IN1']), '0.199533': set(['X4:GATE', 'X8:GATE', 'IN2']), '0.399': set(['X3:GATE', 'X5:GATE', 'X1:GATE', 'IN0', 'X4:GATE', 'Y', 'X8:GATE'])}

Expected output:

temp2 
{'0.1995': set(['X7:GATE', 'X3:GATE','IN1']), '0.199533': set(['X4:GATE', 'X8:GATE', 'IN2']), '0.399': set(['X5:GATE', 'X1:GATE', 'IN0', 'Y'])}

Source

2013-01-16 Jon A.

any('Y' in value for value in test_p) is a better way to test for the presence of 'Y'.
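To make the comment above concrete, here is a small sketch (written for Python 3) that compares three ways of testing a set for 'Y'. The helper names and the `lookalike` sample are made up for illustration and do not come from the thread. Note that both the regex form and the any() form match substrings, so a hypothetical pin name such as 'YY:GATE' would also count; a plain membership test only matches the exact value 'Y'.

```python
import re


def has_y_regex(values):
    # The asker's original test: join everything and search the string.
    return re.search('Y', ' '.join(values)) is not None


def has_y_any(values):
    # The test suggested in the comment: substring check per value.
    return any('Y' in value for value in values)


def has_y_exact(values):
    # Exact membership: only True when the value 'Y' itself is present.
    return 'Y' in values


pins = {'X3:GATE', 'IN0', 'Y'}
lookalike = {'X3:GATE', 'YY:GATE'}  # hypothetical data, no exact 'Y'

print(has_y_regex(pins), has_y_any(pins), has_y_exact(pins))
print(has_y_regex(lookalike), has_y_any(lookalike), has_y_exact(lookalike))
```

If the pin names can never contain the letter Y anywhere else, all three tests agree; otherwise the exact membership test is probably the safest choice.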

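You need to separate the search for the 'Y' value from the search through the rest of the data. You really want to do that while you are already building temp2, to avoid an unnecessary extra loop: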

y_key = None
for d, p in zip(temp_list, temp_search_list):
    temp2[d].add(p)
    if p == 'Y':
        y_key = d

Next, removing the duplicate values is easiest with set.difference_update(), which alters the set in place:

y_values = temp2[y_key]
for test_d, test_p in temp2.iteritems():
    if test_d == y_key:
        continue
    y_values.difference_update(test_p)

Using your example temp2, and assuming y_key was set while temp2 was being built, the result of the second loop is:

>>> temp2 = {'0.1995': set(['X7:GATE', 'X3:GATE', 'IN1']), '0.199533': set(['X4:GATE', 'X8:GATE', 'IN2']), '0.399': set(['X3:GATE', 'X5:GATE', 'X1:GATE', 'IN0', 'X4:GATE', 'Y', 'X8:GATE'])} 
>>> y_key = '0.399' 
>>> y_values = temp2[y_key] 
>>> for test_d, test_p in temp2.iteritems(): 
...     if test_d == y_key:
...         continue
...     y_values.difference_update(test_p)
... 
>>> temp2 
{'0.1995': set(['X7:GATE', 'X3:GATE', 'IN1']), '0.199533': set(['X4:GATE', 'X8:GATE', 'IN2']), '0.399': set(['X5:GATE', 'X1:GATE', 'IN0', 'Y'])}

Note how the values X3:GATE, X4:GATE and X8:GATE have been removed from the 0.399 set.
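The two steps above can be combined into one self-contained sketch. It is written for Python 3, so it uses items() rather than the iteritems() seen in the thread. The function name `build_and_dedupe`, the `marker` parameter and the two sample input lists are assumptions made for illustration; the lists are simply one way to reproduce the example temp2.

```python
from collections import defaultdict


def build_and_dedupe(keys, values, marker='Y'):
    """Group values by key, then strip shared values from the marker set."""
    groups = defaultdict(set)
    marker_key = None
    for key, value in zip(keys, values):
        groups[key].add(value)
        if value == marker:
            marker_key = key  # remember which key holds the marker
    if marker_key is None:
        raise ValueError('no key contains %r' % marker)
    for key, members in groups.items():
        if key != marker_key:
            # In-place removal; only the set changes, not the dict.
            groups[marker_key].difference_update(members)
    return dict(groups)


# Hypothetical inputs that reproduce the example dictionary.
temp_list = ['0.1995'] * 3 + ['0.199533'] * 3 + ['0.399'] * 7
temp_search_list = ['X7:GATE', 'X3:GATE', 'IN1',
                    'X4:GATE', 'X8:GATE', 'IN2',
                    'X3:GATE', 'X5:GATE', 'X1:GATE', 'IN0',
                    'X4:GATE', 'Y', 'X8:GATE']

result = build_and_dedupe(temp_list, temp_search_list)
print(sorted(result['0.399']))
```

Because difference_update() mutates the marker set rather than the dictionary, it should be safe to call while iterating over the dictionary's items.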

Source

2013-01-16 16:02:29

Hi @Martijn, I actually want to remove the duplicates from the set that contains 'Y'. I will update the OP to show the expected result.

@JonA: That's easy. Updated to reverse the difference_update() variables and the resulting output.

You can do the whole thing with only 1 loop that actually has to traverse the entire dataset.

from collections import defaultdict

target = None
result = defaultdict(set)
occurance_dict = defaultdict(int)
# Loop over the inputs, building the result, counting the
# number of occurrences for each value as you go and marking
# the key that contains 'Y'
for key, value in zip(temp_list, temp_search_list):
    # This is here so we don't count values twice if there
    # is more than one instance of the value for the given
    # key. If we don't do this, if a value only exists in
    # the 'Y' set, but it occurs multiple times in the input,
    # we would still filter it out later on.
    if value not in result[key]:
        occurance_dict[value] += 1
        result[key].add(value)
    if value == 'Y':
        if target is None:
            target = key
        elif target != key:
            # A repeated 'Y' under the same key is still one entry.
            raise ValueError('Dataset contains more than 1 entry containing "Y"')
if target is None:
    raise ValueError('Dataset contains no entry containing "Y"')
# Filter the marked ('Y' containing) entry; if there is more than
# 1 occurrence of the given value, then it exists in another entry
# so we don't want it in the 'Y' entry
result[target] = {value for value in result[target] if occurance_dict[value] == 1}

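Yes, occurance_dict is much the same as a collections.Counter, but I would rather not iterate through the dataset twice if I don't have to (even if that happens behind the scenes), and this way we also avoid counting a second occurrence of a given value under the same key.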
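For comparison, here is a rough sketch of the collections.Counter alternative mentioned above, written for Python 3. It starts from an already built dictionary of sets, so each value is counted at most once per key, and it makes a separate pass to count. The function name `filter_marked` and the `marker` parameter are assumptions for illustration, not part of the answer.

```python
from collections import Counter


def filter_marked(groups, marker='Y'):
    """Return a copy of groups where the marker set keeps only unique values."""
    # Count in how many sets each value appears (extra pass over the data).
    counts = Counter(value for members in groups.values() for value in members)
    targets = [key for key, members in groups.items() if marker in members]
    if len(targets) != 1:
        raise ValueError('expected exactly one key containing %r' % marker)
    target = targets[0]
    result = {key: set(members) for key, members in groups.items()}
    # Keep only values that occur in no other set.
    result[target] = {value for value in groups[target] if counts[value] == 1}
    return result


temp2 = {'0.1995': {'X7:GATE', 'X3:GATE', 'IN1'},
         '0.199533': {'X4:GATE', 'X8:GATE', 'IN2'},
         '0.399': {'X3:GATE', 'X5:GATE', 'X1:GATE', 'IN0',
                   'X4:GATE', 'Y', 'X8:GATE'}}

filtered = filter_marked(temp2)
print(sorted(filtered['0.399']))
```

This version leaves the input dictionary untouched, at the cost of the second pass the answer was trying to avoid.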

Source

2013-01-16 16:01:40

I wish I could think of a cute way to do this with list comprehensions and/or the itertools module, but I can't. I would start with something like this:

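dict1 = {1: set([1, 2, 3, 4, 5]),
         2: set([3, 4, 5, 6]),
         3: set([1, 7, 8, 9])}

list1 = dict1.items()
# Start from a copy of every set so that results computed for a key
# are not overwritten when the outer loop reaches that key later.
newDict = dict((k, set(s)) for k, s in list1)
for i in range(len(list1)):
    (k1, set1) = list1[i]
    for j in range(i + 1, len(list1)):
        (k2, set2) = list1[j]
        # Drop from the later set whatever it shares with the earlier one.
        newDict[k2] = newDict[k2] - (set1 & set2)

print newDict
# {1: set([1, 2, 3, 4, 5]), 2: set([6]), 3: set([8, 9, 7])}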

This may not be super efficient if you have huge dictionaries.

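Another idea: is the data so large that you can't just build a collections.Counter? You would first go through the dictionary, pull the members out of each set, and put them into a Counter (possibly with a list comprehension). Then loop over originalDict.iteritems(). In a new dictionary, insert each key (e.g. 0.1995) with the original set as its value, filtered (using & as above, I think) so that it only contains entries whose count in the Counter is > 0. Every element inserted into the new dictionary is then removed from the Counter (even if its count is greater than 1). At the end of the day, you still need to loop twice.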
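One possible reading of the Counter idea described above, as a sketch in Python 3 (hence items() instead of iteritems()). The function name `dedupe_first_seen` is made up for illustration. The result depends on the order in which the dictionary is traversed: the first key to be visited keeps a shared value, and later keys lose it. In Python 3.7 and later that is insertion order.

```python
from collections import Counter


def dedupe_first_seen(original):
    """Keep each value only in the first set (in dict order) that holds it."""
    # First pass: count every member of every set.
    counts = Counter(member
                     for members in original.values()
                     for member in members)
    result = {}
    # Second pass: keep members still present in the counter, then
    # delete them from the counter so later keys cannot claim them.
    for key, members in original.items():
        kept = {member for member in members if counts[member] > 0}
        result[key] = kept
        for member in kept:
            del counts[member]
    return result


dict1 = {1: {1, 2, 3, 4, 5},
         2: {3, 4, 5, 6},
         3: {1, 7, 8, 9}}

print(dedupe_first_seen(dict1))
```

A Counter returns 0 for a missing key without raising, which is why deleted members simply fail the `> 0` test on later keys. Note that this removes duplicates from whichever keys come later, not specifically from the set containing 'Y'.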

Source

2013-01-16 16:26:55 BenDundee

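This looks pretty straightforward. First find the key that has 'Y' in its set of values, then go through all the other value sets and remove their members from that set.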

temp2 = {'0.1995': set(['X7:GATE', 'X3:GATE', 'IN1']),
         '0.199533': set(['X4:GATE', 'X8:GATE', 'IN2']),
         '0.399': set(['X3:GATE', 'X5:GATE', 'X1:GATE', 'IN0', 'X4:GATE', 'Y', 'X8:GATE'])}

# Find the key whose value set contains 'Y'.
y_key = None
for k, v in temp2.iteritems():
    if 'Y' in v:
        y_key = k
        break

if y_key is None:
    print "no 'Y' found in values"
    exit()

# Remove every value held by the other keys from the 'Y' set.
for k, v in temp2.iteritems():
    if k != y_key:
        temp2[y_key] -= v

print 'temp2 = {'
for k, v in temp2.iteritems():
    print '    {!r}: {!r},'.format(k, v)
print '}'

Output:

temp2 = { 
    '0.1995': set(['X7:GATE', 'X3:GATE', 'IN1']), 
    '0.199533': set(['X4:GATE', 'X8:GATE', 'IN2']), 
    '0.399': set(['X5:GATE', 'X1:GATE', 'IN0', 'Y']), 
}
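The same two steps can also be expressed without explicit loops, as a sketch in Python 3. The function name `strip_shared` and the `marker` parameter are hypothetical. It builds the union of all other sets once and subtracts it from the 'Y' set, returning a new dictionary instead of modifying the input.

```python
def strip_shared(groups, marker='Y'):
    """Return a new dict where the marker set drops values found elsewhere."""
    # First key whose set contains the marker, or None if there is none.
    y_key = next((k for k, v in groups.items() if marker in v), None)
    if y_key is None:
        raise ValueError('no %r found in values' % marker)
    # Union of every other set; set().union() with no arguments is empty.
    others = set().union(*(v for k, v in groups.items() if k != y_key))
    cleaned = dict(groups)
    cleaned[y_key] = groups[y_key] - others
    return cleaned


temp2 = {'0.1995': {'X7:GATE', 'X3:GATE', 'IN1'},
         '0.199533': {'X4:GATE', 'X8:GATE', 'IN2'},
         '0.399': {'X3:GATE', 'X5:GATE', 'X1:GATE', 'IN0',
                   'X4:GATE', 'Y', 'X8:GATE'}}

cleaned = strip_shared(temp2)
print(sorted(cleaned['0.399']))
```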

Source

2013-01-16 17:40:03 martineau
