Replacing words using Soundex in Python

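I have a list of sentences. My goal is to replace abbreviated prepositions of the form "opp, nr, off, abv, behnd" with their correct spellings: "opposite, above, behind", and so on. The Soundex codes of these words are the same, so I need to build an expression that goes through the list word by word and, if the Soundex code matches, replaces the word with the correct spelling.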

An example: ['Jack was standing nr the tree',
'they were abv everything he planned',
'Just stand opp the counter',
'Go twrds the gas station']

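So I need to replace the words nr, abv, opp and twrds with their full forms. The Soundex codes of "towards" and "twrds" are the same, so "twrds" should be replaced.
I need to iterate through the list.
Here is the Soundex algorithm: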

import string

# Python 3 names: string.uppercase/lowercase are now ascii_uppercase/ascii_lowercase,
# and string.maketrans is now str.maketrans
allChar = string.ascii_uppercase + string.ascii_lowercase
charToSoundex = str.maketrans(allChar, "91239129922455912623919292" * 2)

def soundex(source):
    "convert string to Soundex equivalent"

    # Soundex requirements:
    # source string must be at least 1 character
    # and must consist entirely of letters
    if (not source) or (not source.isalpha()):
        return "0000"

    # Soundex algorithm:
    # 1. make first character uppercase
    # 2. translate all other characters to Soundex digits
    digits = source[0].upper() + source[1:].translate(charToSoundex)

    # 3. remove consecutive duplicates
    digits2 = digits[0]
    for d in digits[1:]:
        if digits2[-1] != d:
            digits2 += d

    # 4. remove all "9"s
    # 5. pad end with "0"s to 4 characters
    return (digits2.replace('9', '') + '000')[:4]

if __name__ == '__main__':
    import sys
    if sys.argv[1:]:
        # a word was given on the command line: print its code
        print(soundex(sys.argv[1]))
    else:
        # no argument: time soundex() on a few sample names
        from timeit import Timer
        names = ('Woo', 'Pilgrim', 'Flingjingwaller')
        for name in names:
            statement = "soundex('%s')" % name
            t = Timer(statement, "from __main__ import soundex")
            print(name.ljust(15), soundex(name), min(t.repeat()))
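
Before relying on the premise that every abbreviation shares a code with its full spelling, it may help to compare the codes directly. The following is a minimal sketch, assuming Python 3 and the same digit table as the soundex() function above; the `pairs` dictionary and the `codes` name are made up for this illustration, with the full forms taken from the spellings mentioned in the question.

```python
import string

# Same digit table as the soundex() above; '9' marks letters that get dropped.
_LETTERS = string.ascii_uppercase + string.ascii_lowercase
_TABLE = str.maketrans(_LETTERS, "91239129922455912623919292" * 2)


def soundex(source):
    """Compact Python 3 restatement of the soundex() from the question."""
    if not source or not source.isalpha():
        return "0000"
    digits = source[0].upper() + source[1:].translate(_TABLE)
    collapsed = digits[0]
    for d in digits[1:]:
        if collapsed[-1] != d:  # drop consecutive duplicates
            collapsed += d
    return (collapsed.replace("9", "") + "000")[:4]


# Hypothetical test pairs: abbreviation -> intended full form.
pairs = {"nr": "near", "twrds": "towards", "behnd": "behind",
         "abv": "above", "opp": "opposite"}

# For each abbreviation, store (code of abbreviation, code of full form).
codes = {short: (soundex(short), soundex(full)) for short, full in pairs.items()}

for short, (a, b) in codes.items():
    print(f"{short:6} {a}  {pairs[short]:9} {b}  match={a == b}")
```

With this table, nr, twrds and behnd get the same code as their full forms, but abv (A100) differs from above (A110) and opp (O100) differs from opposite (O123). The vowel between the consonants keeps them from collapsing into one digit, so a pure Soundex comparison would leave those two abbreviations unreplaced.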

I'm a beginner, so if there is another approach you can suggest, it would be much appreciated. Thanks.

Source

2014-02-07 Hypothetical Ninja

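Can you fix your indentation? – Bach

Fixed :). Also, should I create my own file containing the correct spellings? –

It's not fixed. Everything from 'def' to 'return' should be indented. – Bach

I would use the enchant module: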

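import enchant
d = enchant.Dict("en_US")

# soundex() is the function defined in the question

phrase = ['Jack was standing nr the tree',
          'they were abv everything he planned',
          'Just stand opp the counter',
          'Go twrds the gas station']

output = []
for section in phrase:
    sect = ''
    for word in section.split():
        if d.check(word):
            # correctly spelled: keep as is
            sect += word + ' '
        else:
            # use the first suggestion that has the same Soundex code
            for correct_word in d.suggest(word):
                if soundex(correct_word) == soundex(word):
                    sect += correct_word + ' '
                    break
            else:
                # no suggestion matched: keep the original word
                sect += word + ' '
    output.append(sect[:-1])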
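
On the comment asking about keeping your own file of correct spellings: one possible alternative to a spell-checking dictionary is a small hand-maintained word list indexed by Soundex code. The sketch below is only an illustration under assumptions. `PREPOSITIONS`, `BY_CODE`, `is_subsequence` and `expand` are hypothetical names, and the extra rule that an abbreviation's letters must appear in order inside the full word is an added guard against false matches, not part of Soundex.

```python
import string

_TABLE = str.maketrans(string.ascii_uppercase + string.ascii_lowercase,
                       "91239129922455912623919292" * 2)


def soundex(source):
    """Compact Python 3 restatement of the soundex() from the question."""
    if not source or not source.isalpha():
        return "0000"
    digits = source[0].upper() + source[1:].translate(_TABLE)
    collapsed = digits[0]
    for d in digits[1:]:
        if collapsed[-1] != d:
            collapsed += d
    return (collapsed.replace("9", "") + "000")[:4]


# Hypothetical list of correct spellings; it could equally be read from a file.
PREPOSITIONS = ["near", "towards", "behind", "above", "opposite"]

# Soundex code -> correct spelling (the first word wins if two codes collide).
BY_CODE = {}
for full_word in PREPOSITIONS:
    BY_CODE.setdefault(soundex(full_word), full_word)


def is_subsequence(short, full):
    """True if the letters of `short` occur in `full` in the same order."""
    remaining = iter(full)
    return all(ch in remaining for ch in short)


def expand(sentence):
    """Replace words that match a listed preposition by Soundex code."""
    out = []
    for word in sentence.split():
        full = BY_CODE.get(soundex(word))
        # Assumed guard: only replace when the word looks like an abbreviation.
        if full and word.lower() != full and is_subsequence(word.lower(), full):
            out.append(full)
        else:
            out.append(word)
    return " ".join(out)


print(expand("Jack was standing nr the tree"))
print(expand("Go twrds the gas station"))
```

Words whose code does not appear in the list, including anything with punctuation attached, pass through unchanged. Abbreviations such as abv and opp, whose codes differ from those of their full forms, would still need an explicit abbreviation-to-word mapping.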

Source

2014-10-03 09:16:42 Hrabal
