2017-02-25 64 views

Why did the dictionary change size during iteration? Here is the error message:

RuntimeError: dictionary changed size during iteration

Here is my code snippet (<= marks the line with the error):

import math
from nltk.probability import ConditionalFreqDist

# Probability distribution from a sequence of tuple tokens
def probdist_from_tokens(tokens, N, V=0, addone=False):
    cfd = ConditionalFreqDist(tokens)
    pdist = {}

    for a in cfd:  # <= line with the error
        pdist[a] = {}
        S = 1 + sum(1 for b in cfd[a] if cfd[a][b] == 1)
        A = sum(cfd[a][b] for b in cfd[a])

        # Add the log probs.
        for b in cfd[a]:
            B = sum(cfd[b][c] for c in cfd[b])
            boff = ((B + 1) / (N + V)) if addone else (B / N)
            pdist[a][b] = math.log((cfd[a][b] + (S * boff)) / (A + S))

        # Add OOV for tag if relevant
        if addone:
            boff = 1 / (N + V)
            pdist[a]["<OOV>"] = math.log((S * boff) / (A + S))

    return pdist

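I am basically just using cfd as a reference to put the correct values into pdist. I don't want to change cfd; I only want to iterate over its keys and the keys of its sub-dictionaries.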

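I think the problem is caused by the lines that set the variables A and B: I got the same error when I used different code on those lines, but no error when I replaced them with constants.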


Can you provide a self-contained example that demonstrates the problem? – BrenBarn

Answer


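nltk.probability.ConditionalFreqDist inherits from defaultdict, which means that if you read a non-existent entry cfd[b], a new entry (b, FreqDist()) is inserted into the dictionary, changing its size. A demonstration of the problem: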

import collections 
d = collections.defaultdict(int, {'a': 1}) 
for k in d: 
    print(d['b']) 

Output:

0 
Traceback (most recent call last): 
    File "1.py", line 4, in <module> 
    for k in d: 
RuntimeError: dictionary changed size during iteration 
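
As a side check on the demonstration above, the following minimal sketch shows which kinds of access insert a key into a defaultdict and which do not. It uses only the standard library; the variable names are illustrative.

```python
import collections

d = collections.defaultdict(int, {'a': 1})

# Membership tests and .get() never call __missing__, so nothing is inserted.
assert 'b' not in d
assert d.get('b') is None
size_before = len(d)

# Indexing a missing key calls __missing__, which stores the default value.
value = d['b']
size_after = len(d)

print(size_before, size_after, value)
```

Only the indexing form grows the dictionary, which is why it is unsafe inside a loop over the same dictionary.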

So you should check this line:

for b in cfd[a]: 
     B = sum (cfd[b][c] for c in cfd[b]) 

Are you sure the key b really exists in cfd? You may need to change it to

 B = sum(cfd[b].values()) if b in cfd else 0 
#        ^~~~~~~~~~~ 
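
The guarded lookup above can be tried without NLTK. The sketch below is only an approximation: it assumes a defaultdict of Counter objects is a fair stand-in for ConditionalFreqDist, and the helper name safe_total and the sample tag pairs are hypothetical.

```python
import collections

# Stand-in for ConditionalFreqDist (assumption): a defaultdict of Counters.
cfd = collections.defaultdict(collections.Counter)
pairs = [('DT', 'NN'), ('NN', 'VB'), ('DT', 'JJ'), ('JJ', 'NN')]
for condition, sample in pairs:
    cfd[condition][sample] += 1


def safe_total(cfd, key):
    """Total count stored under key, or 0 if key is not a condition.

    The membership test runs first, so a missing key is never inserted.
    """
    return sum(cfd[key].values()) if key in cfd else 0


totals = {}
for a in cfd:
    for b in cfd[a]:
        # 'VB' appears only as a sample, never as a condition.
        totals[b] = safe_total(cfd, b)

print(totals, len(cfd))
```

With the guard in place the outer loop finishes, and cfd still holds only the three original conditions. Whether 0 is the right fallback count for an unseen key depends on the smoothing you intend.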