2013-07-16 45 views

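Python XML BeautifulSoup: getting the text of child nodes

In previous projects I pulled data from XML tag attributes, but I can't figure out how to get the text of child nodes. The program takes an id from a text file, inserts it into a URL, and then parses the response from that URL. The XML looks like this: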

<Article> 
    <Sometag Owner="Steve" Status="online"> 
     <ID Version="1">231119634</ID>
     <DateCreated> 
      <Year>2012</Year> 
      <Month>10</Month> 
      <Day>10</Day> 
     </DateCreated> 

I want to get the year, month, and day text out of DateCreated.

So far I have the following for the child tags, with no luck:

link = "http://somelink.com/"+line.rstrip('\n')+"?id=xml&format=text"
args = (curlLink + ' -L ' + link + ' -o c:\\temp.txt --proxy-ntlm -x http://myproxy:80 -k -U:')
sp = subprocess.Popen(args) #run curl
sp.wait() #Wait for it to finish before proceeding
xml_string = open(r'C:\temp.txt', 'r').read() #read in the temporary file
os.remove(r'C:\temp.txt') # clean up
soup = BeautifulSoup(xml_string)
result = soup.find('DateCreated')
if result is not None:
    date = result.children.get_text()
    g.write(date +"\n")

Thanks for the downvote with no explanation. – sdweldon

Answer


There are several different ways to get the information out of the data:

# date here is the parsed DateCreated tag
year = int(date.Year.text)
month = int(date.Month.text)
day = int(date.Day.text)

date.text gives you the text content as a string. Which form to use depends on what you actually need.
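
As a rough end-to-end illustration of both approaches, here is a minimal sketch. It makes a few assumptions that are not in the original post: the sample XML is a hypothetical, well-formed version of the snippet from the question, the helper names extract_date and joined_date_text are made up for this example, and the document is parsed with Python's built-in html.parser. That parser lowercases tag names, so the lookups use datecreated, year, month, and day. With an XML parser (BeautifulSoup(xml_string, "xml"), which needs lxml installed) the original casing such as DateCreated would be kept instead.

```python
from bs4 import BeautifulSoup

# Hypothetical sample mirroring the XML in the question, with closing tags added.
XML_STRING = """
<Article>
  <Sometag Owner="Steve" Status="online">
    <ID Version="1">231119634</ID>
    <DateCreated>
      <Year>2012</Year>
      <Month>10</Month>
      <Day>10</Day>
    </DateCreated>
  </Sometag>
</Article>
"""


def extract_date(xml_string):
    """Return (year, month, day) as ints, or None if DateCreated is missing."""
    # html.parser lowercases tag names, hence the lowercase lookups below.
    soup = BeautifulSoup(xml_string, "html.parser")
    date = soup.find("datecreated")
    if date is None:
        return None
    # Look up each child tag by name and convert its text to an int.
    return tuple(
        int(date.find(name).get_text(strip=True))
        for name in ("year", "month", "day")
    )


def joined_date_text(xml_string):
    """Return the text of all DateCreated children joined into one string."""
    soup = BeautifulSoup(xml_string, "html.parser")
    date = soup.find("datecreated")
    if date is None:
        return None
    # stripped_strings yields each text node with surrounding whitespace removed.
    return "".join(date.stripped_strings)


print(extract_date(XML_STRING))
print(joined_date_text(XML_STRING))
```

The first helper suits cases where the individual numbers are needed (for example to build a datetime.date), while the second is closer to what the question's g.write(date + "\n") line appears to be after. Note that get_text() is called on a tag, not on .children, which is an iterator and has no such method.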