Writing an XML file that contains the Euro sign (€) with xml.etree in Python

I am trying to use xml.etree to read and write an XML file that contains the € sign.

My simplified code looks like this:

optionsdirectory = os.getcwd()
optionsfile = os.path.join(optionsdirectory, "conf")
optionstree = ET.parse(optionsfile)
options = optionstree.getroot()
for option in options:
    if option.tag == "currency":
        option.text = "€"
optionstree.write(optionsfile, encoding="UTF-8")

When I run it, I get the following error:

File "C:\curr.py", line 8 
    optionstree.write(optionsfile, encoding="UTF-8") 
File "C:\Python27\lib\xml\etree\ElementTree.py", line 815, in write 
    serialize(write, self._root, encoding, qnames, namespaces) 
File "C:\Python27\lib\xml\etree\ElementTree.py", line 934, in _serialize_xml 
    _serialize_xml(write, e, encoding, qnames, None) 
File "C:\Python27\lib\xml\etree\ElementTree.py", line 932, in _serialize_xml 
    write(_escape_cdata(text, encoding)) 
File "C:\Python27\lib\xml\etree\ElementTree.py", line 1068, in _escape_cdata 
    return text.encode(encoding, "xmlcharrefreplace") 
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 2114: ordinal not in range(128)

Is there a way to write the € sign to an XML file using xml.etree?

2012-09-29 jake

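You need to use a unicode literal. It is easier to use a Unicode escape than the character itself:

option.text = u"\u20AC" # Euro sign

When you use a byte string literal instead of a unicode literal, Python tries to decode the value to unicode using the default encoding, which is ASCII. That causes the UnicodeDecodeError you see.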
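The implicit decode can be made visible. The following is a minimal sketch in Python 3 (an assumption, since the question uses Python 2.7), where the step has to be written out by hand. It assumes the € arrived as the single byte 0x80, as the traceback suggests; the variable names are illustrative only.

```python
# Python 3 sketch: perform by hand the ASCII decode that Python 2 attempts
# implicitly when .encode() is called on a byte string.
raw = b"\x80"  # the euro sign as one Windows-1252 byte (assumed encoding)

try:
    raw.decode("ascii")  # the hidden step that fails in the question
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)

# Starting from text instead of bytes, encoding works without trouble.
euro = "\u20ac"
encoded = euro.encode("utf-8")
print(encoded)
```

The failure comes from the decode, not from the requested UTF-8 encode, which is why the exception is a UnicodeDecodeError even though the code only asked to write.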

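If you really do want to use the unescaped character, make sure you declare the encoding of your source file at the top of the file: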

# -*- coding: utf-8 -*-

and make sure your editor saves the file as UTF-8. You still want to use a unicode literal, though:

option.text = u"€"

2012-09-29 19:39:10

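Here is how to create an XML file with non-ASCII characters. Note that you need to save the source file in the encoding declared by # coding: and use Unicode literals (u'string'). In the example below I write the file as both UTF-8 and ASCII to demonstrate that ElementTree reads the file back correctly in both cases: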

# coding: utf8 
from xml.etree import ElementTree as et 

# Create the root element. 
root = et.Element('test') 
root.text = u'123€456' 

# Wrap the root in an ElementTree and write files. 
tree = et.ElementTree(root) 
tree.write('utf8.xml',encoding='UTF-8') 
tree.write('ascii.xml',encoding='ascii') 

# Verify that each file can be read correctly. 
tree = et.parse('utf8.xml') 
print tree.getroot().text 
tree = et.parse('ascii.xml') 
print tree.getroot().text 

# display the raw contents of the files 
with open('utf8.xml','rb') as f: 
    print repr(f.read()) 
with open('ascii.xml','rb') as f: 
    print repr(f.read())

Note the output. 0xE2 0x82 0xAC is the UTF-8 hex sequence for the Euro character. &#8364; is the character reference.

123€456 
123€456 
"<?xml version='1.0' encoding='UTF-8'?>\n<test>123\xe2\x82\xac456</test>" 
"<?xml version='1.0' encoding='ascii'?>\n<test>123&#8364;456</test>"
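For readers on Python 3, roughly the same comparison can be sketched in memory with ET.tostring and ET.fromstring instead of files. This is an adaptation, not the answer's original code, and whether an XML declaration appears in the output depends on the encoding name and the Python version, so the sketch does not rely on it.

```python
# Python 3 sketch of the same comparison, done in memory.
import xml.etree.ElementTree as ET

root = ET.Element("test")
root.text = "123\u20ac456"  # "123€456"; every str is Unicode in Python 3

# Serialize once as UTF-8 and once as ASCII.
utf8_bytes = ET.tostring(root, encoding="utf-8")
ascii_bytes = ET.tostring(root, encoding="us-ascii")

print(repr(utf8_bytes))   # contains the bytes \xe2\x82\xac
print(repr(ascii_bytes))  # contains the character reference &#8364;

# Both serializations parse back to the same text.
text_from_utf8 = ET.fromstring(utf8_bytes).text
text_from_ascii = ET.fromstring(ascii_bytes).text
```

Characters that the target encoding cannot represent are written as numeric character references, which the parser turns back into the original character.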

2012-09-30 17:08:15

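I thought I was good with web pages, but for the life of me I can't figure out how to "respond" to a post on this site the way other people do. So I had to create this new answer...

Thanks for your response, Mark Tolonen. Your response and Martijn Pieters's both involve using a Unicode literal, but that won't work for me. The XML file I'm working with is created by writing out file names that contain the € sign. I get the file names with the following code: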

for file in os.listdir(r'C:\test'): 
    filenamelist = filenamelist + " " + file

Some of these file names contain the € sign in the name itself. I then want to write the file names into an XML element as follows:

optionsdirectory = os.getcwd()
optionsfile = os.path.join(optionsdirectory, "conf.xml")
optionstree = ET.parse(optionsfile)
options = optionstree.getroot()
for option in options:
    if option.tag == "filenames":
        option.text = filenamelist
optionstree.write(optionsfile, encoding="UTF-8")

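Initially I have an XML file, "conf.xml", that contains an empty "filenames" element. I know this is lame, but it works for my purposes.

So the € sign cannot come from a Unicode literal. When I run the code above, I get the error I posted in my original question, which basically says that it throws up its hands when it hits the € sign in 'filenamelist'.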

2012-09-30 17:38:20 Jake

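Well, I finally figured it out. I did the following three things, and I think all of them are needed: 1) Saved the initial 'conf.xml' file as a UTF-8 file (this lets ET.parse() work). 2) Changed 'filenamelist = filenamelist + " " + file' to 'filenamelist = filenamelist + " " + file.decode('ISO-8859-1')' (this makes the file name text suitable for writing). 3) Changed the write() statement to 'optionstree.write(optionsfile, encoding='ISO-8859-1')' (to match the encoding used for the file names). – Jake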
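One caveat about the ISO-8859-1 approach: that encoding has no Euro sign, but it maps every byte to the code point with the same value, so bytes survive a decode and encode round trip unchanged. The sketch below (Python 3, with a hypothetical file name) assumes the names were really encoded as Windows-1252, which the 0x80 byte in the traceback and the Windows paths suggest but do not prove.

```python
# Sketch: compare two ways of decoding a byte-string file name.
# The file name is hypothetical; 0x80 is the euro sign in Windows-1252.
raw_name = b"price_\x80.txt"

as_latin1 = raw_name.decode("iso-8859-1")  # yields U+0080, a control character
as_cp1252 = raw_name.decode("cp1252")      # yields U+20AC, the euro sign

# ascii() keeps the output printable on any console.
print(ascii(as_latin1))
print(ascii(as_cp1252))

# ISO-8859-1 round-trips the bytes unchanged, which is why the fix "works".
round_trip = as_latin1.encode("iso-8859-1")
```

Under that assumption, decoding with 'cp1252' and writing the file as UTF-8 would store a real Euro sign rather than relying on every reader to reinterpret byte 0x80.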

I should also note that I'm using Python 2.7. I've read that unicode handling changed in the 3.x versions; maybe I wouldn't run into these problems in Python 3.x. – Jake
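As a rough check of that guess, here is a sketch of the same parse, modify and write cycle in Python 3, where os.listdir() returns str for a str path, so no manual decoding is involved. The file names, the temporary directory and the starting content of conf.xml are stand-ins invented for the example.

```python
# Python 3 sketch of the parse / modify / write cycle from the question.
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the names os.listdir() would return.
filenames = ["report.txt", "price_\u20ac.txt"]

with tempfile.TemporaryDirectory() as workdir:
    optionsfile = os.path.join(workdir, "conf.xml")

    # Start from a config file with an empty <filenames> element.
    with open(optionsfile, "w", encoding="utf-8") as f:
        f.write("<options><filenames /></options>")

    optionstree = ET.parse(optionsfile)
    for option in optionstree.getroot():
        if option.tag == "filenames":
            option.text = " ".join(filenames)
    optionstree.write(optionsfile, encoding="UTF-8", xml_declaration=True)

    # Read the file back to confirm that the euro sign survived.
    stored = ET.parse(optionsfile).getroot().find("filenames").text

# ascii() keeps the output printable on any console.
print(ascii(stored))
```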
