Python XML BeautifulSoup: getting the text of child nodes

-1

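In a previous project I pulled data from XML tag attributes, but I can't figure out how to get the text of child nodes. The program extracts IDs from a text file, inserts each one into a URL, and then parses the response from that URL. The XML looks like this: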

<Article> 
    <Sometag Owner="Steve" Status="online"> 
     <ID Version="1">231119634</ID> 
     <DateCreated> 
      <Year>2012</Year> 
      <Month>10</Month> 
      <Day>10</Day> 
     </DateCreated>

I want to get the text of the Year, Month, and Day child tags of DateCreated.

So far I have the following, with no luck:

link = "http://somelink.com/" + line.rstrip('\n') + "?id=xml&format=text"
args = (curlLink + ' -L ' + link + ' -o c:\\temp.txt --proxy-ntlm -x http://myproxy:80 -k -U:')
sp = subprocess.Popen(args)  # run curl
sp.wait()  # wait for it to finish before proceeding
xml_string = open(r'C:\temp.txt', 'r').read()  # read in the temporary file
os.remove(r'C:\temp.txt')  # clean up
soup = BeautifulSoup(xml_string)
result = soup.find('DateCreated')
if result is not None:
    date = result.children.get_text()
    g.write(date + "\n")

Source

2013-07-16 sdweldon

Thanks for the downvote for no reason – sdweldon

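There are a few different ways to get the information out of the data. Assuming date is the DateCreated tag (for example, the result of soup.find('DateCreated')):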

year = int(date.Year.text) 
month = int(date.Month.text) 
day = int(date.Day.text)

Alternatively, date.text gives you the text content as a single string. Which one you should use depends on what you actually need.
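
For reference, here is a minimal self-contained sketch of both approaches, not the full pipeline from the question. It assumes bs4 and lxml are installed so that the "xml" parser is available; that parser keeps tag-name case, whereas the HTML parsers lowercase tag names, in which case find('DateCreated') would find nothing. The sample string is the question's XML with closing tags added so that it is well-formed, and the helper names extract_date_parts and child_texts are made up for illustration. Note that .children is an iterator over the child nodes, so it has no get_text() of its own; call get_text() on each child tag or on the parent tag instead.

```python
from bs4 import BeautifulSoup

# Sample data: the question's XML, with closing tags added (an assumption)
# so that the snippet is well-formed.
xml_string = """<Article>
    <Sometag Owner="Steve" Status="online">
        <ID Version="1">231119634</ID>
        <DateCreated>
            <Year>2012</Year>
            <Month>10</Month>
            <Day>10</Day>
        </DateCreated>
    </Sometag>
</Article>"""


def extract_date_parts(xml_text):
    """Return (year, month, day) as ints, or None if DateCreated is missing."""
    # The "xml" feature needs lxml and preserves tag-name case.
    soup = BeautifulSoup(xml_text, "xml")
    date = soup.find("DateCreated")
    if date is None:
        return None
    # date.Year is shorthand for date.find("Year").
    return (int(date.Year.text), int(date.Month.text), int(date.Day.text))


def child_texts(xml_text):
    """Return the text of each direct child tag of DateCreated."""
    soup = BeautifulSoup(xml_text, "xml")
    date = soup.find("DateCreated")
    if date is None:
        return []
    # find_all(True, recursive=False) yields only direct child tags and
    # skips the whitespace-only strings between them.
    return [child.get_text(strip=True)
            for child in date.find_all(True, recursive=False)]


if __name__ == "__main__":
    parts = extract_date_parts(xml_string)
    if parts is not None:
        year, month, day = parts
        print("%04d-%02d-%02d" % (year, month, day))
    print(child_texts(xml_string))
```

The first helper suits the case where you need the numbers; the second returns the raw strings if all you want is to write them out.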

Source

2013-07-16 20:13:15 mata
