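How do I associate XML text with the preceding empty element in Python?
I have inherited some XML that I need to process in Python. I am using xml.etree.cElementTree, and I am having trouble associating the text that follows an empty element with that empty element's tag. The real XML is much more complex than what I have pasted below, but I have simplified it to make the problem clearer (I hope!).
I would like the result to be a dictionary like this:
Expected result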
{(9, 1): 'As they say, A student has usually three maladies:', (9, 2): 'poverty, itch, and pride.'}
The tuples could also contain strings (e.g. ('9', '1')). I don't really mind at this early stage.
Here is the XML:
test1.xml
<div1 type="chapter" num="9">
  <p>
    <section num="1"/> <!-- The empty element -->
    As they say, A student has usually three maladies: <!-- Here lies the trouble -->
    <section num="2"/> <!-- Another empty element -->
    poverty, itch, and pride.
  </p>
</div1>
Here is what I have tried:
Attempt 1
>>> import xml.etree.cElementTree as ET
>>> tree = ET.parse('test1.xml')
>>> root = tree.getroot()
>>> chapter = root.attrib['num']
>>> d = dict()
>>> for p in root:
...     for section in p:
...         d[(int(chapter), int(section.attrib['num']))] = section.text
...
>>> d
{(9, 2): None, (9, 1): None} # This of course makes sense, since the elements are empty
Attempt 2
>>> for p in root:
...     for section, text in zip(p, p.itertext()):  # unfortunately, p and p.itertext() are two different lengths, which also makes sense
...         d[(int(chapter), int(section.attrib['num']))] = text.strip()
...
>>> d
{(9, 2): 'As they say, A student has usually three maladies:', (9, 1): ''}
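As the second attempt shows, p and p.itertext() have different lengths: p.itertext() also yields the whitespace before the first section element, so every text item is shifted by one position. The value stored under (9, 2) is the one I am trying to associate with the key (9, 1), and the value I want under (9, 2) does not appear in d at all (because zip truncates the longer p.itertext()).
Any help would be greatly appreciated. Thanks in advance.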
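One possible approach, offered as a minimal sketch rather than a definitive answer: ElementTree stores the text that follows an element's closing tag in that element's tail attribute, so each empty section element already carries the text that comes after it. The sketch below assumes the simplified structure shown in test1.xml (inlined here as a string, without the comments), uses xml.etree.ElementTree because cElementTree was removed in Python 3.9, and introduces a hypothetical helper named sections_by_tail for illustration.

```python
import xml.etree.ElementTree as ET

# Inline copy of the simplified test1.xml (an assumption made so the
# example is self-contained; the comments from the original are omitted).
XML = """<div1 type="chapter" num="9">
<p>
<section num="1"/>
As they say, A student has usually three maladies:
<section num="2"/>
poverty, itch, and pride.
</p>
</div1>"""


def sections_by_tail(root):
    """Map (chapter, section) to the text that follows each empty <section/>.

    Hypothetical helper: assumes `root` is the <div1> element and that every
    <section> element has a numeric `num` attribute.
    """
    chapter = int(root.attrib['num'])
    d = {}
    for section in root.iter('section'):
        # .tail holds the text between this element's end and the next tag;
        # it is None when nothing follows, hence the `or ''`.
        d[(chapter, int(section.attrib['num']))] = (section.tail or '').strip()
    return d


result = sections_by_tail(ET.fromstring(XML))
print(result)
```

With a file on disk, ET.parse('test1.xml').getroot() could be passed to the helper instead of ET.fromstring(XML). If the real document nests other elements between the section markers, the tail alone would only capture the text up to the next tag, so the loop would need to be extended.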
Brilliant. Works like a charm. Thanks. – user3079064