How do I convert escaped characters in Python?

I want to convert a string containing escaped characters to its normal form, the same way Python's lexical parser does:

>>> escaped_str = 'One \\\'example\\\''
>>> print(escaped_str)
One \'example\'
>>> normal_str = normalize_str(escaped_str)
>>> print(normal_str)
One 'example'

Of course, the tedious way would be to replace all known escape sequences one by one: http://docs.python.org/reference/lexical_analysis.html#string-literals

How would you implement normalize_str() in the code above?

2011-07-29 aligf

r'raw string' – JBernardo

What is the problem here? –

 
>>> escaped_str = 'One \\\'example\\\'' 
>>> print escaped_str.encode('string_escape') 
One \\\'example\\\' 
>>> print escaped_str.decode('string_escape') 
One 'example'

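Several similar codecs are available, such as ROT13 and hex.

The above is for Python 2.x, but since you have said (below, in a comment) that you are using Python 3.x: although decoding a Unicode string object is a roundabout thing to do, it is still possible. The codec has also been renamed to "unicode_escape":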

 
Python 3.3a0 (default:b6aafb20e5f5, Jul 29 2011, 05:34:11) 
[GCC 4.4.3] on linux2 
Type "help", "copyright", "credits" or "license" for more information. 
>>> escaped_str = "One \\\'example\\\'" 
>>> import codecs 
>>> print(codecs.getdecoder("unicode_escape")(escaped_str)[0]) 
One 'example'
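
For Python 3, the same idea can be wrapped into the normalize_str() helper the question asks for. This is only a minimal sketch, not a full reimplementation of the lexer: it assumes the input is pure ASCII, because the "unicode_escape" codec would garble other characters, and it raises an error rather than mangling them.

```python
def normalize_str(escaped):
    """Decode backslash escapes in an ASCII-only str (sketch under that assumption)."""
    # encode("ascii") raises UnicodeEncodeError on non-ASCII input,
    # so unsupported text fails loudly instead of being corrupted.
    return escaped.encode("ascii").decode("unicode_escape")


print(normalize_str('One \\\'example\\\''))
```

Going through bytes mirrors what the comments below suggest: in Python 3 only bytes objects have a decode() method.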

2011-07-29 02:23:57

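One good turn deserves another :) FWIW, I once found that I could solve a problem elegantly by writing my own string codec. –

This approach does not seem to work in Python 3. I get: AttributeError: 'str' object has no attribute 'decode'. – aligf

In Python 3, what used to be 'str' is 'bytes', and what used to be 'unicode' is 'str'. You probably need to 'encode' to utf8 or ascii first (to get bytes) and then decode from 'string_escape' (called 'unicode_escape' in Python 3, as noted in the answer) – SingleNegationElimination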

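The unpaired backslashes are just an artifact of the representation and are not actually stored internally. Trying to handle this by hand is likely to introduce errors.

If all you want is to drop each backslash that is not itself escaped by a preceding backslash, you could try a while loop: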

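escaped_str = 'One \\\'example\\\''
chars = []
i = 0
while i < len(escaped_str):
    # A backslash escapes the next character: keep that character, drop the backslash.
    # The bounds check leaves a lone trailing backslash untouched.
    if escaped_str[i] == '\\' and i + 1 < len(escaped_str):
        chars.append(escaped_str[i + 1])
        i += 2
    else:
        chars.append(escaped_str[i])
        i += 1
fixed_str = ''.join(chars)
print(fixed_str)  # One 'example'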

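Once you inspect your variables, you will see why what you are trying to do does not make sense.

...but on a side note, regarding "the same way Python's lexical parser does": I am almost 100% sure that no parser is involved here, so to speak. Parsers deal with grammar, which describes how words fit together.

You may be thinking of lexical analysis, whose rules are usually specified with regular expressions. A parser is an altogether more challenging and more powerful beast, and not something you want to mess with for linear string manipulation.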
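
To illustrate the lexical-analysis point, the standard library's tokenize module exposes Python's tokenizer. The sketch below (variable names are illustrative, not from this thread) shows that the tokenizer only isolates the string literal, escapes intact; turning the literal into its value is a separate step, done here with ast.literal_eval.

```python
import ast
import io
import tokenize

# The source text being tokenized is:  x = 'One \'example\''
source = "x = 'One \\'example\\''\n"

# String literals come out as STRING tokens whose text still
# contains the surrounding quotes and the backslashes.
tokens = tokenize.generate_tokens(io.StringIO(source).readline)
literals = [tok.string for tok in tokens if tok.type == tokenize.STRING]
print(literals[0])

# Evaluating the literal's text yields the actual string value.
print(ast.literal_eval(literals[0]))
```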

2011-07-29 01:26:43

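What the OP calls a "lexical parser" is probably more accurately called the **lexical analyzer**, and Python certainly has one. Fortunately, we do not have to reinvent it; it is exposed in some detail, see my answer. –

I think the real question is: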

I have a string that is formatted as if it were a part of Python source code. How can I safely interpret it so that \n within the string is transformed into a newline, quotation marks are expected on either end, etc. ?

Try ast.literal_eval.

>>> import ast 
>>> print ast.literal_eval(raw_input()) 
"hi, mom.\n This is a \"weird\" string, isn't it?" 
hi, mom. 
This is a "weird" string, isn't it?

For comparison, going the other way:

>>> print repr(raw_input()) 
"hi, mom.\n This is a \"weird\" string, isn't it?" 
'"hi, mom.\\n This is a \\"weird\\" string, isn\'t it?"'
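
The sessions above are Python 2 (print statement, raw_input). A rough Python 3 equivalent might look like the sketch below; unquote_literal is a hypothetical helper name, and the extra type check is an assumption added so that other literals such as numbers or lists are rejected.

```python
import ast


def unquote_literal(text):
    """Evaluate text as a Python string literal; the quotes must be part of text."""
    value = ast.literal_eval(text)
    if not isinstance(value, str):
        raise ValueError("not a string literal: %r" % (text,))
    return value


# The same input as in the session above, quotes included.
literal = r'''"hi, mom.\n This is a \"weird\" string, isn't it?"'''
value = unquote_literal(literal)
print(value)

# repr() goes the other way, and the result can be evaluated back.
print(repr(value))
```

Input without surrounding quotes, such as the example in the question, is not a valid literal and makes ast.literal_eval raise an exception.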

2011-07-29 02:03:43

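literal_eval requires a valid string literal, including the opening and closing quotes. Adding the quotes yourself (the example in the question does not have them) involves several edge cases, depending on what kind of input you want to accept. –

@Fred Very true; but I think that in most cases this really is the problem you want to solve, and the opening and closing quotes are actually there even though the OP left them out of the example. :) –

I am not sure that is really the problem you always want to solve: I would guess the string_escape codec (as in my answer) exists to meet a real need to convert escapes without requiring a string literal. (Pointing out literal_eval is still useful, though; I upvoted. ;) –

SingleNegationElimination already mentioned this, but here is an example:

In Python 3: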

>>> escaped_str = 'One \\\'example\\\''
>>> print(escaped_str.encode('ascii', 'ignore').decode('unicode_escape'))
One 'example'
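
One caveat with the 'ignore' error handler: it silently drops every non-ASCII character from the input. A possible alternative, which is not taken from this thread, is to find the escape sequences with a regular expression and decode only those, leaving all other text untouched. The names ESCAPE_RE and decode_escapes below are illustrative, and the pattern is an approximation of the escapes listed in the string-literal documentation.

```python
import codecs
import re

# Matches one backslash escape sequence at a time.
ESCAPE_RE = re.compile(r'''
    \\ (?: U[0-9a-fA-F]{8}    # 8-digit Unicode escape
         | u[0-9a-fA-F]{4}    # 4-digit Unicode escape
         | x[0-9a-fA-F]{2}    # 2-digit hex escape
         | [0-7]{1,3}         # octal escape
         | N\{[^}]+\}         # named Unicode character
         | [\\'"abfnrtv]      # single-character escapes
       )''', re.VERBOSE)


def decode_escapes(text):
    """Decode escape sequences in text; everything else is kept as is."""
    def decode_match(match):
        # Each match is pure ASCII, so unicode_escape handles it safely.
        return codecs.decode(match.group(0), 'unicode_escape')
    return ESCAPE_RE.sub(decode_match, text)


print(decode_escapes('One \\\'example\\\''))
```

Because only the matched sequences pass through the codec, accented or other non-ASCII characters elsewhere in the string survive unchanged.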

2017-10-13 19:51:40 Attaque
