Memory error when calling a function with large arguments

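This program is given a list of sentence-starting words (the seed bank) and a dictionary of word pairs, and uses them to create a gibberish sentence. The dictionary holds information, taken from a text file, about which words follow which.

For example, a text.txt file containing 'This is a cat. He is a dog.' would mean we input the following: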

seedBank = ['This', 'He'] 

pairs = { 'This':['is'],'is':['a','a'],'a':['cat','dog'],'He':['is'] }
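As a rough sketch of how such inputs might be derived from the text: the helper name build_inputs and the rule of splitting sentences on '.', '?' and '!' are assumptions for illustration, not part of the original program.

```python
import re
from collections import defaultdict

def build_inputs(text):
    """Hypothetical helper: derive (seed_bank, pairs) from raw text."""
    seed_bank = []
    pairs = defaultdict(list)
    # Assumption: a sentence ends at '.', '?' or '!', and the punctuation is dropped.
    for sentence in re.split(r'[.!?]', text):
        words = sentence.split()
        if not words:
            continue
        seed_bank.append(words[0])  # first word of each sentence is a seed
        # Record every (word, following word) pair inside the sentence.
        for current, following in zip(words, words[1:]):
            pairs[current].append(following)
    return seed_bank, dict(pairs)

seed_bank, pairs = build_inputs('This is a cat. He is a dog.')
print(seed_bank)  # ['This', 'He']
```

With this splitting rule, the last word of a sentence never becomes a key, which is what lets the walk stop on short inputs.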

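The function then uses these inputs to create a randomly generated sentence that makes vague sense, because it follows a semi-grammatically-correct format.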

import random

def gibberish_sentence(seedBank, pairs):
    gibSentence = []
    gibSentence.append(random.choice(seedBank))  # random seed
    x = gibSentence[0]
    while pairs.get(x) is not None:  # loop while x is a key in the dictionary
        y = random.choice(pairs.get(x))  # random follower of key x
        gibSentence.append(y)  # follower is added to the sentence
        x = y  # key x is reset to y
    return ' '.join(gibSentence)  # string

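The problem:

This program works fine for making small sentences like the one above with a fixed random.seed(value), but it returns a memory error when given a very large set of inputs (seedBank and pairs). My question is: what about this program could cause it to have trouble with larger arguments?

Note that the arguments aren't actually that large. I don't have the text document, but it wouldn't be so large that there isn't enough memory.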

Error output: [screenshot not preserved]

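Thank you very much.

SOLVED: Thanks! The problem has been solved. It was in fact a missing condition that caused it: the loop kept running through the whole text instead of ending when it reached a word with a full stop or question mark. Essentially, this overloaded the memory. Thanks to everyone for helping!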
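A minimal sketch of the kind of stopping condition described here. It assumes the words in pairs keep their trailing punctuation (for example 'off.'); the function name sentence_until_punctuation and the sample data are hypothetical, not the code that was actually submitted.

```python
import random

def sentence_until_punctuation(seed_bank, pairs):
    """Hypothetical variant: stop once a word ends a sentence."""
    words = [random.choice(seed_bank)]
    # Keep walking until the latest word carries terminal punctuation.
    while not words[-1].endswith(('.', '?', '!')):
        followers = pairs.get(words[-1])
        if not followers:  # dead end: no recorded follower
            break
        words.append(random.choice(followers))
    return ' '.join(words)

# Assumed sample data with a cycle ('and' <-> 'on') and one way out ('off.').
sample_pairs = {'and': ['on', 'off.'], 'on': ['and']}
print(sentence_until_punctuation(['and'], sample_pairs))
```

Even though sample_pairs contains a cycle, each visit to 'and' has a chance of picking 'off.', so the walk ends with probability 1.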


2015-03-19 Finn

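How big is the text file? On the order of KB? MB? GB? Also, I think we need to see the calling code; I'd bet you are accidentally making copies that take up a lot of memory. – 2015-03-19 06:36:38

Unfortunately it's an automated testing system, but I have emailed the person who runs the tests so that I can check it manually. I think the problem may be related to the infinite loop mentioned below, but I'll take this into account. The text file is only 9.5KB, so something is very wrong! – Finn 2015-03-19 06:53:03

Thanks! The problem is solved. It was in fact a missing condition that caused it: the loop ran through the whole text instead of ending when it reached a word with a full stop, question mark, etc. Essentially, that overloaded the memory. Thanks to everyone here for helping! – Finn 2015-03-19 08:48:02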

It's hard to say without the actual pairs, but there is a possibility of an infinite loop if all the words reference each other at some point:

pairs = { 'someone':['thinks'],'thinks':['that','how'],'that':['someone','anyone'],'how':['someone'], 'anyone': ['thinks'] }

This will never finish.
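One way to check a given pairs dictionary for this situation is a depth-first search for cycles. This is only a sketch; has_cycle is a hypothetical helper, and it uses recursion, so a very long chain of words could hit Python's recursion limit.

```python
def has_cycle(pairs):
    """Hypothetical helper: True if some chain of followers loops back on itself."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}

    def visit(word):
        state[word] = GREY
        for follower in pairs.get(word, []):
            follower_state = state.get(follower, WHITE)
            if follower_state == GREY:  # back edge: follower is on the current path
                return True
            if follower_state == WHITE and visit(follower):
                return True
        state[word] = BLACK
        return False

    return any(state.get(word, WHITE) == WHITE and visit(word) for word in pairs)

looping = {'someone': ['thinks'], 'thinks': ['that', 'how'],
           'that': ['someone', 'anyone'], 'how': ['someone'], 'anyone': ['thinks']}
print(has_cycle(looping))  # True
```

A cycle only makes an endless run possible. In the dictionary above every follower is itself a key, so there is no exit at all and the loop cannot stop.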


2015-03-19 06:37:48

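That's a valid point that I hadn't thought of. Since the function is only meant to generate one sentence, it should finish when the word has a . or ? or ! as its last character, so I may need to add a test case for that. Unfortunately this is the output of an automated testing system and I don't have the word_pairs_bank myself :/ Thanks for the reply! – Finn 2015-03-19 06:51:50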

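Joining a list of strings isn't the worst approach, but it isn't the best in terms of space efficiency.

Consider using (untested, of course) something like StringIO: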

from io import StringIO  # Python 3 location of StringIO (cStringIO on Python 2)
import random

def gibberish_sentence(seedBank, pairs):
    seed = random.choice(seedBank)
    gibSentence = StringIO()
    gibSentence.write(seed)  # random seed
    gibSentence.write(' ')
    x = seed
    while pairs.get(x) is not None:  # loop while x is a key in the dictionary
        y = random.choice(pairs.get(x))  # random follower of key x
        gibSentence.write(y)  # follower is added to the buffer
        gibSentence.write(' ')
        x = y  # key x is reset to y
    return gibSentence.getvalue()  # string (note the trailing space)

Here's a comparison of different string concatenation methods, in terms of operations per second and memory footprint.
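For anyone who wants to measure this locally, here is a rough sketch using the standard tracemalloc module (Python 3.4+). The helper names join_words, stringio_words and peak_bytes are hypothetical, and the numbers will vary by interpreter and platform, so no result is claimed here.

```python
import tracemalloc
from io import StringIO

def join_words(words):
    # Collect every word in a list first, then join once.
    return ' '.join(list(words))

def stringio_words(words):
    # Write each word into an in-memory buffer as it arrives.
    buf = StringIO()
    for i, word in enumerate(words):
        if i:
            buf.write(' ')
        buf.write(word)
    return buf.getvalue()

def peak_bytes(build, n_words=100000):
    # Peak traced allocation while building a sentence of n_words words.
    words = ('word%d' % i for i in range(n_words))
    tracemalloc.start()
    result = build(words)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return len(result), peak

for build in (join_words, stringio_words):
    length, peak = peak_bytes(build)
    print(build.__name__, length, peak)
```

Neither approach helps if the loop never terminates: both keep growing until memory runs out.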


2015-03-19 06:43:58 jedwards

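Hi, thanks for the response! Sorry, I'm very new to Python syntax, so I'm really just learning the ropes. I don't think this particular problem is caused by concatenation, but thanks for the efficiency tip! – Finn 2015-03-19 06:49:08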

As Tim Pietzcker says, if there is a cycle in pairs, your code can loop forever. Here is a minimal example:

>>> seedBank = ['and'] 
>>> pairs = {'and': ['on'], 'on': ['and']} 
>>> gibberish_sentence(seedBank, pairs) # just keeps going

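You can make sure the sentences you generate (eventually) end by modifying the pairs dictionary so that it contains a sentinel value wherever a word occurs last in a sentence. For example, for a source text such as "You and me and the dog.":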

seedBank = ['You'] 

pairs = { 
    'You': ['and'], 
    'and': ['me', 'the'], 
    'me': ['and'], 
    'the': ['dog'], 
    'dog': ['.'], 
}

...and add a check for the sentinel in gibberish_sentence():

import random

def gibberish_sentence(seedBank, pairs):
    gibSentence = []
    gibSentence.append(random.choice(seedBank))  # random seed
    x = gibSentence[0]
    while pairs.get(x) is not None:  # loop while x is a key in the dictionary
        y = random.choice(pairs.get(x))  # random follower of key x
        if y == '.':  # sentinel reached: the sentence is finished
            break
        gibSentence.append(y)  # follower is added to the sentence
        x = y  # key x is reset to y
    return ' '.join(gibSentence)  # string

...which gives the sentence a chance to terminate:

>>> gibberish_sentence(seedBank, pairs) 
'You and the dog' 
>>> gibberish_sentence(seedBank, pairs) 
'You and me and me and me and me and me and the dog' 
>>> gibberish_sentence(seedBank, pairs) 
'You and me and the dog'
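The sentinel only gives the sentence a chance to end, and a cycle with no sentinel on it would still run forever. One possible safety net is a hard cap on the number of words. This is a sketch; the max_words parameter and the name gibberish_sentence_capped are hypothetical additions, not part of the code above.

```python
import random

def gibberish_sentence_capped(seed_bank, pairs, max_words=50):
    """Hypothetical variant: sentinel check plus an upper bound on length."""
    words = [random.choice(seed_bank)]
    while len(words) < max_words:
        followers = pairs.get(words[-1])
        if not followers:  # no recorded follower: stop
            break
        following = random.choice(followers)
        if following == '.':  # sentinel reached: the sentence is finished
            break
        words.append(following)
    return ' '.join(words)

# A pure cycle with no sentinel now stops at the cap instead of exhausting memory.
print(gibberish_sentence_capped(['and'], {'and': ['on'], 'on': ['and']}, max_words=6))
# and on and on and on
```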


2015-03-19 06:58:56

Building a list can be avoided by using a generator, which is very memory-efficient.

import random

def gibberish_sentence(seedBank, pairs):
    x = random.choice(seedBank)  # random seed
    yield x
    while pairs.get(x) is not None:  # loop while x is a key in the dictionary
        y = random.choice(pairs.get(x))  # random follower of key x
        yield y
        x = y  # key x is reset to y

print(' '.join(gibberish_sentence(seedBank, pairs)))  # string

Or, if the function must return a string, the generator can be built inside it like this:

import random

def gibberish_sentence(seedBank, pairs):
    def words():
        x = random.choice(seedBank)  # random seed
        yield x
        while pairs.get(x) is not None:  # loop while x is a key in the dictionary
            y = random.choice(pairs.get(x))  # random follower of key x
            yield y
            x = y  # key x is reset to y
    return ' '.join(words())  # string
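Note that ' '.join() still consumes the whole generator, so a cycle in pairs would make it run without end. Because a generator is lazy, the number of words can be bounded from outside with itertools.islice. A small sketch follows; gibberish_words and the limit of 6 are assumptions for illustration.

```python
import itertools
import random

def gibberish_words(seed_bank, pairs):
    """Hypothetical generator: yields words one at a time, possibly forever."""
    word = random.choice(seed_bank)
    yield word
    while pairs.get(word) is not None:
        word = random.choice(pairs[word])
        yield word

cyclic_pairs = {'and': ['on'], 'on': ['and']}
# islice stops pulling from the generator after 6 words, even though it is endless.
first_six = list(itertools.islice(gibberish_words(['and'], cyclic_pairs), 6))
print(' '.join(first_six))  # and on and on and on
```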


2015-03-19 07:17:33
