Error encoding to a unicode string in Python 2.7?

I want to print the unicode version of a string in Python 2.7. It works fine in Python 3, but with Python 2.7 I get the error shown below.

x="strings are now utf-8 \u03BCnico\u0394é!"

Python 3:

from platform import python_version  # needed for python_version()
print('Python', python_version())
print(x)

Python 3.4.1 
strings are now utf-8 μnicoΔé!

Python 2.7:

>>> x='strings are now utf-8 \u03BCnico\u0394é!' 
>>> x.encode('utf-8') 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 38: ordinal not in range(128)

Edit: I tried the following:

>>> x = u'strings are now utf-8 \\u03BCnico\\u0394\xc3\xa9!' 
>>> x 
u'strings are now utf-8 \\u03BCnico\\u0394\xc3\xa9!' 
>>> x.encode("utf-8") 
'strings are now utf-8 \\u03BCnico\\u0394\xc3\x83\xc2\xa9!' 
>>> x 
u'strings are now utf-8 \\u03BCnico\\u0394\xc3\xa9!'

I don't see any encoding happening.

Edit 2:

>>> x=u'strings are now utf-8 \u03BCnico\u0394é!' 
>>> x.encode("utf-8") 
'strings are now utf-8 \xce\xbcnico\xce\x94\xc3\xa9!' 
>>> b=x.encode("utf-8") 
>>> b 
'strings are now utf-8 \xce\xbcnico\xce\x94\xc3\xa9!' 
>>>

Source

2014-07-14 eagertoLearn

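Your first problem is that you are trying to encode a byte string. You **decode** byte strings to Unicode, and you **encode** Unicode to byte strings in a specific encoding (such as 'utf-8'). –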
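To illustrate the decode/encode distinction from the comment above, here is a minimal sketch. The variable names are made up for illustration, and the byte values are the UTF-8 form of the non-ASCII part of the question's string. It should run unchanged on Python 2.7 and 3.x.

```python
# Round trip: bytes -> text (decode) -> bytes (encode).
# raw holds the UTF-8 bytes for mu, "nico", Delta, e-acute.
raw = b'\xce\xbcnico\xce\x94\xc3\xa9'

text = raw.decode('utf-8')       # byte string -> unicode text
encoded = text.encode('utf-8')   # unicode text -> byte string

print(len(raw))        # 10 bytes
print(len(text))       # 7 code points
print(encoded == raw)  # True: the round trip is lossless
```

Decoding is the only step that needs to know which encoding the bytes are in; once you hold unicode text, you can encode it to any codec that covers its characters.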

Just try printing the unicode literal without '.encode()': 'print x'. – wwii

Your second problem is that you are trying to use unicode escape sequences ('\u...') in a byte string. They only work in unicode literals, as shown in @LyndsySimon's answer. –
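The point about escape sequences can be sketched as follows. In Python 2, a plain '...' literal leaves '\u03BC' as six separate ASCII characters, which is written explicitly here as b'\\u03BC' so the snippet behaves the same on 2.7 and 3.x. The names are illustrative, and the 'unicode_escape' step is only one possible way to recover the character, not something the thread itself proposes.

```python
# What Python 2 stores for the plain (byte) literal '\u03BC':
# a backslash followed by the characters u, 0, 3, B, C.
stored = b'\\u03BC'

# In a unicode literal the same escape is a single code point.
text = u'\u03BC'

print(len(stored))  # 6
print(len(text))    # 1

# If you are stuck with the escaped form, the unicode_escape codec
# can interpret it (assumption: the bytes contain only such escapes).
recovered = stored.decode('unicode_escape')
print(recovered == text)  # True
```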

In Python 2.x, you need to use a unicode literal:

x=u"strings are now utf-8 \u03BCnico\u0394é!"

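Without this, the encode method doesn't know what encoding the string is in, and assumes it is ASCII. It then tries to convert from ASCII to UTF-8, and fails as soon as it encounters a character outside the ASCII character set.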
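A rough sketch of what happens under the hood: in Python 2, calling .encode() on a byte string first decodes it with the default codec (ASCII unless it has been changed), and that hidden step is what raises the UnicodeDecodeError in the question. The snippet below makes that step explicit so it runs on both 2.7 and 3.x; the sample bytes and names are illustrative.

```python
# UTF-8 bytes for "caf" followed by e-acute (0xc3 0xa9).
raw = b'caf\xc3\xa9'

# Python 2's str.encode('utf-8') implicitly does this decode first.
try:
    raw.decode('ascii')
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    print(exc)  # reports the byte that falls outside the ASCII range

# Stating the real encoding explicitly avoids the failure.
fixed = raw.decode('utf-8').encode('utf-8')
print(fixed == raw)  # True
```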

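Also note that Python 3.3 and above support this notation. There it is essentially a no-op, because all strings are already unicode, but it lets developers write code that is compatible with both 2.x and 3.3+.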
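As a small sketch of such 2.x/3.3+ compatible code, the snippet below uses the u prefix and should behave the same on both. The é is written as \u00e9 here only so the file stays pure ASCII and needs no source-encoding declaration; that substitution is a choice made for this example, not part of the answer.

```python
# The u prefix is required on 2.x and accepted (as a no-op) on 3.3+.
x = u"strings are now utf-8 \u03BCnico\u0394\u00e9!"

# Encoding unicode text to UTF-8 bytes works identically on both.
encoded = x.encode("utf-8")

print(repr(encoded))
print(encoded.decode("utf-8") == x)  # True
```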

Source

2014-07-14 18:06:41

Please see the edits above: – eagertoLearn

I reviewed the comments on your OP. Do you still need me to look at your changes? –

I got it working. Thanks – eagertoLearn
