Encoding issue in Python

I have an encoding issue in Python when reading a file that contains

foo = "Gro\xdfbritannien"

I'm using the following to read the file, but the text always comes out with the \x in it:

import codecs

# Python 2: open the file and decode it as UTF-8 while reading
f = codecs.open('myfile', 'r', 'utf8')
for line in f:
    print line
    print line.encode('utf-8')
    print line.decode('utf-8')

I can't see how to display the text correctly encoded, the way it appears when I do:

>>> print u'Gro\xdfbritannien'
Großbritannien

Any hint would be appreciated!


2014-02-13 apassant

If your file literally contains a quoted string with a backslash and an 'x' in it, you need to parse the string literal with something like `decode('string-escape')`. – user2357112
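A minimal sketch of the "parse the string literal" idea from the comment above, using `ast.literal_eval` from the standard library instead of the `string-escape` codec. It assumes every line has the form `name = "literal"`; the helper name `parse_assignment` is hypothetical, and the code is written for Python 3, where the result is a text string.

```python
import ast

def parse_assignment(line):
    """Split 'name = "literal"' and evaluate the right-hand side as a Python literal."""
    name, _, literal = line.partition("=")
    return name.strip(), ast.literal_eval(literal.strip())

# The raw string mimics a line read from the file: a real backslash, then x, d, f.
raw_line = r'foo = "Gro\xdfbritannien"'
name, value = parse_assignment(raw_line)
print(name, value)  # foo Großbritannien
```

`ast.literal_eval` only accepts literals, so unlike `eval` it will not execute arbitrary code found in the file.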

When your file contains the line

foo = "Gro\xdfbritannien"

it contains an actual backslash character, followed by x, d and f. So when that line is read into a Python string, it reads

'foo = "Gro\\xdfbritannien"'

(and since these are all ASCII characters, it doesn't matter whether you open the file with the utf-8 codec or not).
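To make this concrete, the sketch below compares the text as it sits in the file with the string that was intended. The variable names are illustrative only, a raw string stands in for a line read from the file, and the code is written for Python 3.

```python
# What the file holds: a backslash followed by the letters x, d, f (all ASCII).
from_file = r"Gro\xdfbritannien"

# What was intended: a single character, U+00DF (LATIN SMALL LETTER SHARP S).
intended = "Gro\xdfbritannien"

print(len(from_file), len(intended))          # 17 14
print(list(from_file[3:7]))                   # ['\\', 'x', 'd', 'f']

# Every character is ASCII, so decoding the file as UTF-8 changes nothing.
print(all(ord(c) < 128 for c in from_file))   # True
```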

So you first need to decode it with the string_escape codec:

>>> foo.decode("string_escape") 
'Gro\xdfbritannien'

and then decode the result into a proper Unicode object:

>>> _.decode("latin1") 
u'Gro\xdfbritannien'

Then you can print it:

>>> print _ 
Großbritannien
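The interactive steps above are Python 2, where `str` is a byte string and the `string_escape` codec exists. That codec is not available on Python 3; a rough equivalent, assuming the line is pure ASCII apart from the escape sequences, is to go through the `unicode_escape` codec. The helper name `unescape_line` is hypothetical.

```python
def unescape_line(line):
    """Turn literal escape sequences such as \\xdf into the characters they name.

    Assumes `line` contains only ASCII characters; non-ASCII input raises
    UnicodeEncodeError instead of being silently mangled.
    """
    return line.encode("ascii").decode("unicode_escape")

raw = r'foo = "Gro\xdfbritannien"'   # stands in for a line read from the file
print(unescape_line(raw))            # foo = "Großbritannien"
```

Because `unicode_escape` maps `\xdf` directly to U+00DF, the separate latin1 step from the Python 2 session is not needed in this sketch.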


2014-02-13 09:12:53

Thanks - works perfectly with print line.decode("string_escape").decode("latin1") – apassant

-1

This has nothing to do with codecs. You should write it like this: "foo = 'Gro\xdfbritannien'"

>>> print u'Gro\\xdfbritannien' 
Gro\xdfbritannien


2014-02-13 09:20:44 UnZike
