What does str.encode expect as input?

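I want to use unicode rather than str for all strings in my project. I am trying to use the str.encode method, but I cannot work out from the documentation exactly what encode does or what it expects as input.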

The Greek small letter pi is U+03C0, which is 0xCF 0x80 when encoded in UTF-8. This is what I get:

>>> s1 = '\xcf\x80' 
>>> s1.encode('utf-8','ignore') 

Traceback (most recent call last): 
    File "<pyshell#61>", line 1, in <module> 
    s1.encode('utf-8','ignore') 
UnicodeDecodeError: 'ascii' codec can't decode byte 0xcf in position 0: ordinal not in range(128)

I also tried:

>>> s2='\x03\xc0' 

>>> s2.encode('utf-8','ignore') 

Traceback (most recent call last): 
    File "<pyshell#62>", line 1, in <module> 
    s2.encode('utf-8','ignore') 
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc0 in position 1: ordinal not in range(128)

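What does encode expect as input, and why does the 'ignore' option not ignore the error? I also tried 'replace', and it did not suppress the error either.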

2015-01-02 Old Geezer

In Python 2.x, str is a byte string (already encoded). You can decode it into a unicode object:

>>> s1 = '\xcf\x80' # string literal (str) 
>>> s1.decode('utf-8') 
u'\u03c0'
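
On the second attempt in the question: '\x03\xc0' is just the two bytes 0x03 0xC0, which is not the UTF-8 encoding of U+03C0. The following is a minimal sketch of the difference between a code point and its encoded bytes. The variable names are mine, and the b'' and u'' prefixes are there only so that it runs on both Python 2.7 and Python 3:

```python
# The same code point, U+03C0, has a different byte form in each codec.
utf8_bytes = b'\xcf\x80'     # UTF-8 encoding of pi (s1 in the question)
other_bytes = b'\x03\xc0'    # the bytes used for s2 in the question

pi_from_utf8 = utf8_bytes.decode('utf-8')

# The s2 bytes happen to be the big-endian UTF-16 encoding of U+03C0.
pi_from_utf16 = other_bytes.decode('utf-16-be')

# Decoding the s2 bytes as UTF-8 fails: 0xC0 is never a valid UTF-8 byte.
try:
    other_bytes.decode('utf-8')
    s2_is_valid_utf8 = True
except UnicodeDecodeError:
    s2_is_valid_utf8 = False
```

Both successful decodes give the same unicode object, u'\u03c0', so the codec passed to decode has to match how the bytes were actually produced.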

A unicode object is what you can encode:

>>> u1 = u'\u03c0' # unicode literal (unicode) U+03C0 
>>> u1.encode('utf-8') 
'\xcf\x80'
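
As for why the calls in the question raised a UnicodeDecodeError, and why 'ignore' did not help: in Python 2, calling encode on a str roughly amounts to first decoding it implicitly with the default codec (ASCII, assuming the default encoding has not been changed) in strict mode, and only then encoding. The errors argument reaches the encode step only. The sketch below spells out that implicit step. The helper name is hypothetical, and the b'' and u'' prefixes are used so that it also runs on Python 3, where bytes has no encode method:

```python
def py2_style_str_encode(raw, encoding, errors='strict'):
    """Hypothetical helper mimicking Python 2's str.encode on a byte string."""
    # Step 1: implicit decode with the default codec (assumed ASCII), strict.
    text = raw.decode('ascii')
    # Step 2: the requested encode; only this step sees `errors`.
    return text.encode(encoding, errors)

try:
    py2_style_str_encode(b'\xcf\x80', 'utf-8', 'ignore')
    failure = None
except UnicodeDecodeError as exc:
    # Raised in step 1, before 'ignore' has any effect.
    failure = exc.reason

# Decode explicitly with the right codec, then encode the unicode object.
pi = b'\xcf\x80'.decode('utf-8')
round_trip = pi.encode('utf-8')

# Error handlers act on the operation they are passed to.
ascii_ignored = pi.encode('ascii', 'ignore')    # unencodable char dropped
ascii_replaced = pi.encode('ascii', 'replace')  # unencodable char becomes '?'
```

So encode is meant to be called on unicode objects. For a str that holds bytes, call decode with the correct codec first.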

2015-01-02 05:07:42 falsetru
