Python 2.7: remove special characters and spaces, but not Chinese characters

string = "Special $#! characters spaces 888323 Kek ཌི ༜ 郭 ༜ དྀ "

The result should be: "Specialcharactersspaces888323Kek郭"

I have:
print ''.join(c for c in string.decode('utf-8') if u'\u4e00' <= c <= u'\u9fff')

But it returns an error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u90ed' in position 49: ordinal not in range(128)

My question is the same as the title: how do I remove special characters and spaces, but keep Chinese characters?

2017-02-05 John Walker

This solution uses the re.compile and re.sub functions:

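# -*- coding: utf-8 -*-
import re

# Unicode literal: in Python 2 a plain str would be matched byte by byte, which would also strip 郭
string = u"Special $#! characters spaces 888323 Kek ཌི ༜ 郭 ༜ དྀ "

# Pattern that matches every character except ASCII alphanumerics and Chinese characters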
pattern = re.compile(u'[^a-z0-9⺀-⺙⺛-⻳⼀-⿕々〇〡-〩〸-〺〻㐀-䶵一-鿃豈-鶴侮-頻並-龎]', re.UNICODE | re.IGNORECASE) 
result = pattern.sub(u'', string)

# print(result)  # Python 3 syntax
print result

Output:

Specialcharactersspaces888323Kek郭
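
The traceback in the question is consistent with calling .decode('utf-8') on a value that is already unicode: Python 2 first encodes it implicitly with the ascii codec, which fails on 郭. Below is a minimal sketch of the question's generator approach without that call. It assumes the input is already a unicode string and that only ASCII letters, digits, and the basic CJK Unified Ideographs range (U+4E00 to U+9FFF) should be kept, which is narrower than the pattern above. The helper name keep_alnum_and_cjk is hypothetical.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

import string as string_constants

# ASCII letters and digits that should survive the filter
ASCII_ALNUM = set(string_constants.ascii_letters + string_constants.digits)


def keep_alnum_and_cjk(text):
    """Keep ASCII letters/digits and characters in U+4E00..U+9FFF; drop the rest.

    Assumes `text` is already unicode (no .decode() call is made here).
    """
    return ''.join(
        c for c in text
        if c in ASCII_ALNUM or '\u4e00' <= c <= '\u9fff'
    )


sample = "Special $#! characters spaces 888323 Kek ཌི ༜ 郭 ༜ དྀ "
print(keep_alnum_and_cjk(sample))  # Specialcharactersspaces888323Kek郭
```

If the data arrives as UTF-8 bytes in Python 2, decode it once at the boundary (for example raw.decode('utf-8')) before passing it to the helper.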

2017-02-05 11:57:11 RomanPerekhrest

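What if I don't want to remove characters like '$#!?<>'? @RomanPerekhrest –

Exceptions for special characters such as these: !@#$%^&*():"<>/". –

@ChinYe, show the expected result according to your "exception list" – RomanPerekhrest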
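
One possible way to honor such an exception list, sketched under assumptions: the characters to preserve are passed in explicitly and escaped with re.escape before being added to the negated character class. The function name remove_except and its keep parameter are hypothetical, and only the basic CJK range U+4E00 to U+9FFF is covered for brevity, not every range in the answer's pattern.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

import re


def remove_except(text, keep=''):
    """Remove every character except ASCII alphanumerics,
    CJK ideographs in U+4E00..U+9FFF, and the characters listed in `keep`.

    Assumes `text` and `keep` are unicode strings.
    """
    # re.escape neutralizes characters such as ^, ] and - that are
    # special inside a character class
    pattern = re.compile('[^a-zA-Z0-9\u4e00-\u9fff' + re.escape(keep) + ']')
    return pattern.sub('', text)


sample = "Special $#! characters spaces 888323 Kek ཌི ༜ 郭 ༜ དྀ "
print(remove_except(sample, keep='$#!'))  # Special$#!charactersspaces888323Kek郭
```

Passing the kept characters as an argument keeps the pattern itself unchanged when the exception list grows.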
