Python: cleaning the words in a sentence

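I want to write a function that takes a string (a sentence), cleans it, and returns only the letters, digits, and hyphens. But my code seems to be wrong. Please tell me what I am doing wrong here.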

Example: Blake D'souza is an !d!0t
Should return: Blake D'souza is an d0t

The Python code:

def remove_unw2anted(str): 
    str = ''.join([c for c in str if c in 'ABCDEFGHIJKLNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890\'']) 
    return str 

def clean_sentence(s): 
    lst = [word for word in s.split()] 
    #print lst 
    for items in lst: 
        cleaned = remove_unw2anted(items)
    return cleaned 

s = 'Blake D\'souza is an !d!0t' 
print clean_sentence(s)

Source

2013-02-02 Prem Minister

You can use `string.letters + string.digits` instead of that long string.

@Ashwini - I also need to spare a few symbols, such as the hyphen. Is there a trick for that?

`allowed_chars = string.letters + string.digits + '-'` would be enough.

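You only return the last cleaned word! The loop overwrites `cleaned` on every iteration, so only the result for the final word survives.

It should be: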

def clean_sentence(s):
    lst = [word for word in s.split()]

    lst_cleaned = []
    for items in lst:
        lst_cleaned.append(remove_unw2anted(items))
    return ' '.join(lst_cleaned)
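
For readers on Python 3, where `string.letters` no longer exists and `print` is a function, the same word-by-word fix might look like the sketch below. The names `ALLOWED` and `remove_unwanted`, and the choice to keep both the hyphen and the apostrophe, are assumptions drawn from the question and the comments rather than part of the original answer.

```python
import string

# Assumption: keep ASCII letters, digits, the hyphen and the apostrophe.
ALLOWED = set(string.ascii_letters + string.digits + "-'")

def remove_unwanted(word):
    """Return word with every character outside ALLOWED dropped."""
    return ''.join(c for c in word if c in ALLOWED)

def clean_sentence(s):
    """Clean each whitespace-separated word and rejoin with single spaces."""
    return ' '.join(remove_unwanted(word) for word in s.split())

print(clean_sentence("Blake D'souza is an !d!0t"))  # Blake D'souza is an d0t
```

Because the words are split and rejoined, runs of whitespace in the input collapse to a single space.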

A shorter way could be something like this:

def is_ok(c):
    # Keep letters, digits, spaces and apostrophes
    return c.isalnum() or c in " '"

def clean_sentence(s):
    # Python 2: filter() applied to a str returns a str
    return filter(is_ok, s)

s = "Blake D'souza is an !d!0t"
print clean_sentence(s)
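
In Python 3, `filter()` returns a lazy iterator instead of a string, so the shorter version needs an explicit join. A minimal sketch of that adaptation, keeping the answer's `is_ok` test unchanged, could be:

```python
def is_ok(c):
    # Keep letters, digits, spaces and apostrophes.
    return c.isalnum() or c in " '"

def clean_sentence(s):
    # filter() is lazy in Python 3, so join the kept characters back into a str.
    return ''.join(filter(is_ok, s))

print(clean_sentence("Blake D'souza is an !d!0t"))  # Blake D'souza is an d0t
```

Note that `str.isalnum()` in Python 3 also accepts non-ASCII letters and digits, and that this test drops hyphens; add `-` to the allowed string if hyphens should survive.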


2013-02-02 15:35:21 Don

Thank you very much! As you can see, I am very new to Python and only started a few weeks ago.

A variation using string.translate, which has the benefit of being easy to extend and is part of string.

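import string

# Identity translation table containing all 256 byte values (Python 2 only)
allchars = string.maketrans('', '')
tokeep = string.letters + string.digits + '-'
# Every character that is not in tokeep
toremove = allchars.translate(None, tokeep)

s = "Blake D'souza is an !d!0t"
print s.translate(None, toremove)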

Output:

BlakeDsouzaisand0t

The OP said to keep only letters, digits, and hyphens; perhaps they meant to keep whitespace as well?
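
If whitespace should be kept, and the code has to run on Python 3, the translate idea might be adapted roughly as follows. This is a sketch, not the answer's code: `TOKEEP` and the decision to keep only the plain space character are assumptions, and it uses the three-argument form of `str.maketrans` because Python 3 strings are Unicode and have no 256-entry "all characters" table.

```python
import string

# Assumption: keep ASCII letters, digits, the hyphen and the space character.
TOKEEP = set(string.ascii_letters + string.digits + "- ")

def clean_sentence(s):
    # Delete only the unwanted characters that actually occur in s.
    toremove = ''.join(set(s) - TOKEEP)
    # The third argument of str.maketrans lists characters to delete.
    return s.translate(str.maketrans('', '', toremove))

print(clean_sentence("Blake D'souza is an !d!0t"))  # Blake Dsouza is an d0t
```

As in the original answer, the apostrophe is removed because it is not in the keep set; add it to `TOKEEP` to preserve it.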


2013-02-02 19:14:09 sotapme
