2011-02-01 97 views
1

Reading an XML file using ElementTree

I have an XML file. It looks like this:

<root> 
    <Group>  
    <ChapterNo>1</ChapterNo>  
    <ChapterName>A</ChapterName>  
    <Line>1</Line>  
    <Content>zfsdfsdf</Content>  
    <Synonyms>fdgd</Synonyms>  
    <Translation>assdfsdfsdf</Translation>  
    </Group>  
    <Group>  
    <ChapterNo>1</ChapterNo>  
    <ChapterName>A</ChapterName>  
    <Line>2</Line>  
    <Content>ertreter</Content>  
    <Synonyms>retreter</Synonyms>  
    <Translation>erterte</Translation>  
    </Group>  
    <Group>  
    <ChapterNo>2</ChapterNo>  
    <ChapterName>B</ChapterName>  
    <Line>1</Line>  
    <Content>sadsafs</Content> 
    <Synonyms>sdfsdfsd</Synonyms> 
    <Translation>sdfsdfsd</Translation> 
    </Group> 
    <Group> 
    <ChapterNo>2</ChapterNo> 
    <ChapterName>B</ChapterName> 
    <Line>2</Line> 
    <Content>retete</Content> 
    <Synonyms>retertret</Synonyms> 
    <Translation>retertert</Translation> 
    </Group> 
</root> 

I tried this:

root = ElementTree.parse('data.xml').getroot() 
ChapterNo = root.find('ChapterNo').text 
ChapterName = root.find('ChapterName').text 
GitaLine = root.find('Line').text 
Content = root.find('Content').text 
Synonyms = root.find('Synonyms').text 
Translation = root.find('Translation').text 

But it raises an error:

ChapterNo=root.find('ChapterNo').text 
AttributeError: 'NoneType' object has no attribute 'text'

Now I want to get every ChapterNo, ChapterName, etc. separately using ElementTree, and I want to insert this data into a database. Can anyone help me?

Regards,

Nimmy

+0

I tried this: root = ElementTree.parse('data.xml').getroot() ChapterNo = root.find('ChapterNo').text ChapterName = root.find('ChapterName').text GitaLine = root.find('Line').text Content = root.find('Content').text Synonyms = root.find('Synonyms').text Translation = root.find('Translation').text. But it shows the error "ChapterNo = root.find('ChapterNo').text AttributeError: 'NoneType' object has no attribute 'text'" – Nimmy 2011-02-01 10:02:19

+0

Add it to your question; it's hard to read in comments. – 2011-02-01 10:03:11

Answers

1

ChapterNo is not a direct child of root, so root.find('ChapterNo') returns None and will not work. You will need to use XPath syntax to find the data.
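A minimal sketch of why the lookup fails, run on a trimmed-down inline copy of the XML (the `xml_text` string is an assumption made for illustration, not the asker's file). A bare tag name passed to `find` only matches direct children, so it returns `None` here, while a path through `Group` or a `.//` search reaches the element.

```python
from xml.etree import ElementTree

# Trimmed-down stand-in for data.xml (illustrative only).
xml_text = """
<root>
    <Group>
        <ChapterNo>1</ChapterNo>
        <ChapterName>A</ChapterName>
    </Group>
</root>
"""

root = ElementTree.fromstring(xml_text)

# A bare tag name matches direct children of root only,
# and root's only children are <Group> elements.
direct = root.find('ChapterNo')

# Spelling out the path, or searching all descendants, finds the element.
via_path = root.find('Group/ChapterNo')
via_descendants = root.find('.//ChapterNo')

print(direct)                # None
print(via_path.text)         # 1
print(via_descendants.text)  # 1
```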

Also, there are multiple occurrences of ChapterNo, ChapterName, etc., so you should use findall and iterate over the results to get the text of each one.

chapter_nos = [e.text for e in root.findall('.//ChapterNo')] 

And so on for the other tags.
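Repeating that for every field gives one list per tag. As a sketch (on an inline sample, and with a hypothetical helper named `texts`), the lists can be zipped back into one tuple per Group. This assumes every Group contains every field; if one is missing, the lists fall out of alignment.

```python
from xml.etree import ElementTree

# Inline sample standing in for the real file (illustrative only).
xml_text = """
<root>
    <Group>
        <ChapterNo>1</ChapterNo>
        <ChapterName>A</ChapterName>
        <Line>1</Line>
    </Group>
    <Group>
        <ChapterNo>2</ChapterNo>
        <ChapterName>B</ChapterName>
        <Line>1</Line>
    </Group>
</root>
"""

root = ElementTree.fromstring(xml_text)


def texts(tag):
    """Return the text of every descendant element named `tag`, in document order."""
    return [e.text for e in root.findall('.//' + tag)]


chapter_nos = texts('ChapterNo')
chapter_names = texts('ChapterName')
lines = texts('Line')

# One tuple per Group, as long as every Group has all three fields.
rows = list(zip(chapter_nos, chapter_names, lines))
print(rows)
```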

0

Here is a small example that uses SQLAlchemy to define an object that extracts the data and stores it in a SQLite database.

from sqlalchemy import create_engine, Unicode, Integer, Column, UnicodeText 
from sqlalchemy.orm import create_session 
from sqlalchemy.ext.declarative import declarative_base 

engine = create_engine('sqlite:///chapters.sqlite', echo=True) 
Base = declarative_base(bind=engine) 

class ChapterLine(Base): 
    __tablename__ = 'chapterlines' 
    chapter_no = Column(Integer, primary_key=True) 
    chapter_name = Column(Unicode(200)) 
    line = Column(Integer, primary_key=True) 
    content = Column(UnicodeText) 
    synonyms = Column(UnicodeText) 
    translation = Column(UnicodeText) 

    @classmethod 
    def from_xmlgroup(cls, element): 
        l = cls() 
        l.chapter_no = int(element.find('ChapterNo').text) 
        l.chapter_name = element.find('ChapterName').text 
        l.line = int(element.find('Line').text) 
        l.content = element.find('Content').text 
        l.synonyms = element.find('Synonyms').text 
        l.translation = element.find('Translation').text 
        return l 

Base.metadata.create_all() # creates the table 

Here is how to use it:

from xml.etree import ElementTree as etree 

session = create_session(bind=engine, autocommit=False) 
doc = etree.parse('myfile.xml').getroot() 
for group in doc.findall('Group'): 
    l = ChapterLine.from_xmlgroup(group) 
    session.add(l) 

session.commit() 

I have tested this code on your XML data and it works fine: everything is inserted into the database.
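If pulling in SQLAlchemy is not an option, roughly the same insert can be sketched with the standard-library `sqlite3` module. This is a swapped-in alternative, not the code tested above; the in-memory database, the `FIELDS` tuple, and the inline `xml_text` sample are all assumptions for illustration.

```python
import sqlite3
from xml.etree import ElementTree

# Inline sample standing in for the real file (illustrative only).
xml_text = """
<root>
    <Group>
        <ChapterNo>1</ChapterNo>
        <ChapterName>A</ChapterName>
        <Line>1</Line>
        <Content>zfsdfsdf</Content>
        <Synonyms>fdgd</Synonyms>
        <Translation>assdfsdfsdf</Translation>
    </Group>
    <Group>
        <ChapterNo>1</ChapterNo>
        <ChapterName>A</ChapterName>
        <Line>2</Line>
        <Content>ertreter</Content>
        <Synonyms>retreter</Synonyms>
        <Translation>erterte</Translation>
    </Group>
</root>
"""

# Child tags of each <Group>, in the column order used below.
FIELDS = ('ChapterNo', 'ChapterName', 'Line',
          'Content', 'Synonyms', 'Translation')

root = ElementTree.fromstring(xml_text)

rows = []
for group in root.findall('Group'):
    values = [group.findtext(tag) for tag in FIELDS]
    values[0] = int(values[0])  # ChapterNo
    values[2] = int(values[2])  # Line
    rows.append(tuple(values))

conn = sqlite3.connect(':memory:')  # use a file path to persist the data
conn.execute("""
    CREATE TABLE chapterlines (
        chapter_no   INTEGER,
        chapter_name TEXT,
        line         INTEGER,
        content      TEXT,
        synonyms     TEXT,
        translation  TEXT,
        PRIMARY KEY (chapter_no, line)
    )
""")
# Placeholders let sqlite3 handle quoting of the values.
conn.executemany('INSERT INTO chapterlines VALUES (?, ?, ?, ?, ?, ?)', rows)
conn.commit()

stored = conn.execute(
    'SELECT chapter_no, line, content FROM chapterlines '
    'ORDER BY chapter_no, line'
).fetchall()
print(stored)
```

The composite primary key on (chapter_no, line) mirrors the two primary_key=True columns in the SQLAlchemy model.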

2

To parse your simple two-level data structure and assemble a dictionary for each group, all you need to do is this:

>>> # what you did to get `root` 
>>> from pprint import pprint as pp 
>>> for group in root: 
...     d = {} 
...     for elem in group: 
...         d[elem.tag] = elem.text 
...     pp(d)  # or whack it into a database 
... 
{'ChapterName': 'A',
 'ChapterNo': '1',
 'Content': 'zfsdfsdf',
 'Line': '1',
 'Synonyms': 'fdgd',
 'Translation': 'assdfsdfsdf'}
{'ChapterName': 'A',
 'ChapterNo': '1',
 'Content': 'ertreter',
 'Line': '2',
 'Synonyms': 'retreter',
 'Translation': 'erterte'}
{'ChapterName': 'B',
 'ChapterNo': '2',
 'Content': 'sadsafs',
 'Line': '1',
 'Synonyms': 'sdfsdfsd',
 'Translation': 'sdfsdfsd'}
{'ChapterName': 'B',
 'ChapterNo': '2',
 'Content': 'retete',
 'Line': '2',
 'Synonyms': 'retertret',
 'Translation': 'retertert'}
>>> 

Look, Ma, no XPath!
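The session above assumes `root` already exists. A self-contained sketch of the same loop, run on an inline two-group sample (an assumption standing in for the file), collects the dictionaries in a list so they can be handed to a database layer afterwards:

```python
from xml.etree import ElementTree

# Inline sample standing in for the real file (illustrative only).
xml_text = """
<root>
    <Group>
        <ChapterNo>1</ChapterNo>
        <ChapterName>A</ChapterName>
        <Line>1</Line>
        <Content>zfsdfsdf</Content>
    </Group>
    <Group>
        <ChapterNo>2</ChapterNo>
        <ChapterName>B</ChapterName>
        <Line>1</Line>
        <Content>sadsafs</Content>
    </Group>
</root>
"""

root = ElementTree.fromstring(xml_text)

# Iterating an element yields its direct children, so no path expression
# is needed: one dict per <Group>, keyed by child tag name.
records = [{elem.tag: elem.text for elem in group} for group in root]

for record in records:
    print(record['ChapterNo'], record['Line'], record['Content'])
```

All values come back as strings, so numeric fields such as ChapterNo and Line would still need an int() conversion before being stored in integer columns.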