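Replace all words from a word list with another string in Python

I have a string entered by a user, and I want to search it and replace every occurrence of any word from a list with a replacement string.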

import re 

prohibitedWords = ["MVGame","Kappa","DatSheffy","DansGame","BrainSlug","SwiftRage","Kreygasm","ArsonNoSexy","GingerPower","Poooound","TooSpicy"] 


# word[1] contains the user entered message 
themessage = str(word[1])  
# would like to implement a foreach loop here but not sure how to do it in python 
for themessage in prohibitedwords: 
    themessage = re.sub(prohibitedWords, "(I'm an idiot)", themessage) 

print themessage

The code above doesn't work; I'm fairly sure I don't understand how Python for loops work.

Source

2013-03-27 Zac

You should try checking out the Python SpamBayes implementation, which may be more scalable. – dusual 2013-03-27 12:18:01

You can do this with a single call to sub:

big_regex = re.compile('|'.join(map(re.escape, prohibitedWords))) 
the_message = big_regex.sub("repl-string", str(word[1]))

Example:

>>> import re 
>>> prohibitedWords = ['Some', 'Random', 'Words'] 
>>> big_regex = re.compile('|'.join(map(re.escape, prohibitedWords))) 
>>> the_message = big_regex.sub("<replaced>", 'this message contains Some really Random Words') 
>>> the_message 
'this message contains <replaced> really <replaced> <replaced>'
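For reuse, the single-call approach above could be wrapped in a small helper. This is only a sketch: the name `replace_words` and its parameters are made up for illustration and are not part of the original answer. It passes the replacement through a function so that backslashes in the replacement text are taken literally, and it guards against an empty word list.

```python
import re

def replace_words(message, words, replacement):
    """Replace every occurrence of any string in `words` with `replacement`."""
    if not words:
        # An empty alternation compiles to a pattern that matches the empty
        # string at every position, so return the message unchanged instead.
        return message
    pattern = re.compile("|".join(map(re.escape, words)))
    # A function replacement is inserted as-is, with no backslash processing.
    return pattern.sub(lambda _match: replacement, message)

prohibited = ["Kappa", "DansGame", "TooSpicy"]
print(replace_words("Kappa that was TooSpicy", prohibited, "***"))
# prints: *** that was ***
```

If the same word list is applied to many messages, compiling the pattern once outside the function, as the answer does with big_regex, avoids repeated work.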

Note that looping over str.replace can lead to subtle bugs, because a later pass can match text inserted by an earlier one:

>>> words = ['random', 'words'] 
>>> text = 'a sample message with random words' 
>>> for word in words: 
...  text = text.replace(word, 'swords') 
... 
>>> text 
'a sample message with sswords swords'

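Using re.sub gives the correct result, because the text is scanned in a single pass and inserted replacements are never rescanned: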

>>> big_regex = re.compile('|'.join(map(re.escape, words))) 
>>> big_regex.sub("swords", 'a sample message with random words') 
'a sample message with swords swords'

As thg435 pointed out, if you want to replace whole words rather than every substring, you can add word boundaries to the regex:

big_regex = re.compile(r'\b%s\b' % r'\b|\b'.join(map(re.escape, words)))

This replaces 'random' in 'random words' but not in 'pseudorandom words'.
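The word-boundary behaviour can be checked with a short script. The sketch below uses an equivalent spelling of the pattern, with a non-capturing group so that one pair of \b anchors covers every alternative; the variable name `whole_word` is only illustrative.

```python
import re

words = ["random", "words"]
# (?:...) groups the alternatives so both \b anchors apply to each of them.
whole_word = re.compile(r"\b(?:%s)\b" % "|".join(map(re.escape, words)))

print(whole_word.sub("<replaced>", "random words"))
# prints: <replaced> <replaced>
print(whole_word.sub("<replaced>", "pseudorandom words"))
# prints: pseudorandom <replaced>
```

Keep in mind that \b only marks a transition between word and non-word characters, so list entries that start or end with punctuation may not match the way you expect.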

Source

2013-03-27 12:03:13 Bakuriu

Could you show an example run? – 2013-03-27 12:03:51

However, if you have a lot of words to replace, you'll have to break it up. – DSM 2013-03-27 12:15:18

You may want to wrap your expression in '\b' to avoid replacing "tail" in "retailer". – georg 2013-03-27 12:31:30

Try this:

prohibitedWords = ["MVGame","Kappa","DatSheffy","DansGame","BrainSlug","SwiftRage","Kreygasm","ArsonNoSexy","GingerPower","Poooound","TooSpicy"]

# word[1] contains the user-entered message
themessage = str(word[1])
# use a separate loop variable so the outer `word` is not overwritten
for banned in prohibitedWords:
    themessage = themessage.replace(banned, "(I'm an idiot)")

print(themessage)

Source

2013-03-27 12:00:03

This is fragile: as Bakuriu explains, it breaks easily when one prohibited word is a substring of another. – Adam 2013-03-27 12:19:51

@codesparkle That doesn't mean it's wrong; which option you choose always depends on your conditions. – 2013-03-27 12:25:48
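The substring concern raised in these comments also touches the single-regex approach in one respect: Python's regex alternation takes the first alternative that matches, not the longest. A possible safeguard, sketched below, is to sort the words longest-first before joining them. The helper name `build_pattern` and the short entry "Dans" are invented for illustration and do not come from the original list.

```python
import re

def build_pattern(words):
    # Longest first, so "DansGame" is tried before its prefix "Dans".
    ordered = sorted(words, key=len, reverse=True)
    return re.compile("|".join(map(re.escape, ordered)))

words = ["Dans", "DansGame"]  # "Dans" is a hypothetical entry

# Joined in list order, the shorter alternative wins and leaves a fragment.
naive = re.compile("|".join(map(re.escape, words)))
print(naive.sub("*", "DansGame"))
# prints: *Game

print(build_pattern(words).sub("*", "DansGame"))
# prints: *
```

When word boundaries are used, this ordering matters less, since a shorter prefix followed by more letters fails the trailing \b check.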

Code:

prohibitedWords =["MVGame","Kappa","DatSheffy","DansGame", 
        "BrainSlug","SwiftRage","Kreygasm", 
        "ArsonNoSexy","GingerPower","Poooound","TooSpicy"] 
themessage = 'Brain' 
self_criticism = '(I`m an idiot)' 
final_message = [i.replace(themessage, self_criticism) for i in prohibitedWords] 
print(final_message)

Result:

['MVGame', 'Kappa', 'DatSheffy', 'DansGame', '(I`m an idiot)Slug', 'SwiftRage', 
'Kreygasm', 'ArsonNoSexy', 'GingerPower', 'Poooound','TooSpicy']

Source

2013-03-27 12:45:30 zen11625
