How do I replace Unicode Chinese characters in Python?

Say I have a string like this:

example = u"這是一段很蛋疼的中文"

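I want to replace 蛋 with egg in this string. How can I do that?

It seems example.replace() is useless. I also tried a regular expression, but re.match(u"蛋", "") returns None.

I searched a lot, and it seems I should use a method like .decode, but it still doesn't work. Even example.replace(u"\u86CB", "egg") has no effect.

So is there a way to handle Chinese characters?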

2017-05-29 JiangFeng

Which version of Python are you using? – Vej

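It works fine for me (I'm using Python 3.5). The replace function does not modify the original string. If you want to change the original string, you should use example = example.replace(u"蛋", "egg"). – TsReaper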

If you aren't using it already, you should switch to Python 3. – Ryan

In Python 3 you should get output like the following.

>>> import re 
>>> example = u"這是一段很蛋疼的中文" 
>>> re.search(u'蛋',example) 
<_sre.SRE_Match object; span=(5, 6), match='蛋'> 

>>> example.replace('蛋','egg') 
'這是一段很egg疼的中文' 
>>> re.sub('蛋','egg',example) 
'這是一段很egg疼的中文' 

>>> example.replace(u"\u86CB", "egg") 
'這是一段很egg疼的中文' 
>>> re.match('.*蛋',example) 
<_sre.SRE_Match object; span=(0, 6), match='這是一段很蛋'>

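re.match only tries to match at the beginning of the string, so it returns None in your case. Use re.search to find a match anywhere in the string.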
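The difference between the two calls can be sketched as follows. This is a minimal Python 3 illustration, not part of the original answer; the variable names at_start and anywhere are made up for the example.

```python
import re

example = "這是一段很蛋疼的中文"

# re.match only attempts a match at position 0. The string starts
# with 這, so a pattern that begins with 蛋 cannot match there.
at_start = re.match("蛋", example)

# re.search scans forward and returns the first match anywhere.
anywhere = re.search("蛋", example)

print(at_start)         # None
print(anywhere.span())  # (5, 6)
```

If the pattern has to stay with re.match, prefixing it with .* (as in the session above) lets the match begin at position 0 and still reach 蛋.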

2017-05-29 02:38:10 Aaron

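Thanks a lot! It was because I had tested my regular expression in a regex tester (https://regex101.com/). Now I understand, thank you very much! – JiangFeng

In Python 2 you can do something like this:

Edit: Saving the source file in the correct encoding, declaring that encoding in the file as the coding specification requires, and using unicode literals will resolve the issue.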

#!/usr/local/bin/python 
# -*- coding: utf-8 -*- 

example = u"這是一段很蛋疼的中文" 
print example.replace(u"這", u"egg") 
# Within Python3 
# print(example.replace("這", 'egg'))

Output:

egg是一段很蛋疼的中文

2017-05-29 02:35:06

I'm using Python 3, and I found that the cause was that the replace function does not change the original string. Thanks a lot! – JiangFeng
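The cause identified in this thread can be reproduced with a short Python 3 sketch. It assumes the same example string as the question; the names unchanged, replaced, and same_char are only illustrative.

```python
example = "這是一段很蛋疼的中文"

# Strings are immutable: str.replace returns a new string and
# leaves the original untouched, so this result is simply discarded.
example.replace("蛋", "egg")
unchanged = example

# Keep the result by binding it to a name (or rebinding example).
replaced = example.replace("蛋", "egg")

# "\u86CB" is just the escape sequence for the same character.
same_char = "\u86CB" == "蛋"

print(unchanged)  # 這是一段很蛋疼的中文
print(replaced)   # 這是一段很egg疼的中文
print(same_char)  # True
```

In an interactive session the replaced string is echoed back, which can make it look as if the original changed. In a script the return value has to be assigned to be kept.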
