Python "string_escape" vs "unicode_escape"

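According to the docs, the built-in string encoding string_escape:

Produce[s] a string that is suitable as string literal in Python source code

...while the unicode_escape:

Produce[s] a string that is suitable as Unicode literal in Python source code

So they should have roughly the same behaviour. But they appear to treat single quotes differently: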

>>> print """before '" \0 after""".encode('string-escape') 
before \'" \x00 after 
>>> print """before '" \0 after""".encode('unicode-escape') 
before '" \x00 after

The string_escape codec escapes the single quote, while the unicode_escape one does not. Is it safe to assume that I can simply do:

>>> escaped = my_string.encode('unicode-escape').replace("'", "\\'")

...and get the expected behaviour?

Edit: Just to be completely clear, the expected behaviour is getting something suitable as a literal.


2010-06-03 Mike Boers

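Based on my reading of the implementation of unicode-escape and the unicode repr in the CPython 2.6.5 source, yes; the only difference between repr(unicode_string) and unicode_string.encode('unicode-escape') is the addition of the wrapping quotes and the escaping of whichever quote was used.

They are both driven by the same function, unicodeescape_string. This function takes a parameter whose sole purpose is to toggle the addition of the wrapping quotes and the escaping of that quote.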
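For anyone who wants to check the "escape, then patch the single quote" idea without reading the C source, here is a minimal sketch. It assumes Python 3 (where str is already Unicode and encode() returns bytes), not the 2.6.5 interpreter discussed above, and the helper name to_single_quoted_literal is made up for illustration. It wraps the result in single quotes and uses ast.literal_eval to confirm the text round-trips as a literal.

```python
import ast


def to_single_quoted_literal(text):
    """Hypothetical helper: build a single-quoted literal for text."""
    # unicode_escape yields ASCII-only bytes; decode them back to str.
    escaped = text.encode("unicode_escape").decode("ascii")
    # unicode_escape leaves ' untouched, so escape it by hand.
    escaped = escaped.replace("'", "\\'")
    return "'" + escaped + "'"


sample = "before '\" \0 after"
literal = to_single_quoted_literal(sample)
print(literal)

# Parsing the literal should give back the original text.
round_tripped = ast.literal_eval(literal)
print(round_tripped == sample)
```

Backslashes in the input are doubled by unicode_escape before the quote replacement runs, so an input that already contains a backslash followed by a quote still parses back correctly.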


2010-06-08 23:06:46

This is the clearest answer I have found about the "Unicode escape sequence not supported" kind of unicode error, and it still works in 2016! Thanks! – dotslash 2016-07-09 10:23:49

Within the range 0 ≤ c < 128, yes, the single quote (') is the only difference in CPython 2.6.

>>> set(unichr(c).encode('unicode_escape') for c in range(128)) - set(chr(c).encode('string_escape') for c in range(128)) 
set(["'"])

Outside of this range the two types are not interchangeable.

>>> '\x80'.encode('string_escape') 
'\\x80' 
>>> '\x80'.encode('unicode_escape') 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0: ordinal not in range(128)

>>> u'1'.encode('unicode_escape') 
'1' 
>>> u'1'.encode('string_escape') 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
TypeError: escape_encode() argument 1 must be str, not unicode

In Python 3.x, the string_escape encoding no longer exists, since str can only store Unicode.
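A quick way to see this on a Python 3 interpreter is sketched below. The variable names are only for illustration. It shows that unicode_escape still works on str (returning bytes), while asking for string_escape fails at codec lookup.

```python
text = "\x80"

# On Python 3, str.encode('unicode_escape') returns ASCII-only bytes.
escaped = text.encode("unicode_escape")
print(escaped)  # b'\\x80'

# The string_escape codec name is not registered on Python 3.
try:
    text.encode("string_escape")
except LookupError as exc:
    lookup_failed = True
    print("LookupError:", exc)
else:
    lookup_failed = False
```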


2010-06-03 19:32:13 kennytm

That is because '\x80' is not a valid ASCII-encoded string. Try u'\x80'.encode('unicode-escape') and you will get '\\x80' – 2010-06-03 19:58:28

@Mike: But is your 'my_string' a 'str' or a 'unicode'? – kennytm 2010-06-03 20:03:07
