How to select the following sibling tag using XPath

I have an HTML file like this:

<div id="note"> 
<a name="overview"></a> 
<h3>Overview</h3> 
<p>some text1...</p> 
<a name="description"></a> 
<h3>Description</h3> 
<p>some text2 ...</p> 
</div> 

I want to retrieve the paragraph for each heading. For example, overview: some text1, description: some text2 ... I want to write this with XPath in Python. Thanks.


2013-07-31 Gomeisa

Find all the h3 tags, iterate over them, and at each step of the loop find the first following sibling p tag:

from urllib.request import urlopen
from lxml import etree

URL = "http://www.kb.cert.org/vuls/id/628463"
response = urlopen(URL)

# The page is HTML, not well-formed XML, so use the lenient HTML parser.
parser = etree.HTMLParser()
tree = etree.parse(response, parser)

for header in tree.iter('h3'):
    # First <p> among the siblings that follow this heading.
    paragraph = header.xpath('following-sibling::p[1]')
    if paragraph:
        print("%s: %s" % (header.text, paragraph[0].text))

Output:

Overview: The Ruby on Rails 3.0 and 2.3 JSON parser contain a vulnerability that may result in arbitrary code execution. 
Description: Thanks to Lawrence Pit of Mirror42 for discovering the vulnerability. 
Impact: Thanks to Lawrence Pit of Mirror42 for discovering the vulnerability. 
Solution: Thanks to Lawrence Pit of Mirror42 for discovering the vulnerability. 
Vendor Information : Thanks to Lawrence Pit of Mirror42 for discovering the vulnerability. 
CVSS Metrics : Thanks to Lawrence Pit of Mirror42 for discovering the vulnerability. 
References: Thanks to Lawrence Pit of Mirror42 for discovering the vulnerability. 
Credit: Thanks to Lawrence Pit of Mirror42 for discovering the vulnerability. 
Feedback: If you have feedback, comments, or additional information about this vulnerability, please send us 
Subscribe to Updates: Receive security alerts, tips, and other updates.
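
Several headings in the output above share one paragraph. A likely cause, assuming those h3 elements have no p of their own before the next heading, is that the first following p sibling then belongs to a later section. The sketch below shows one possible way to restrict the match to the heading's own section. The Impact and Credit headings and the helper name paragraphs_by_heading are made up for illustration; they are not taken from the real page or from the answer.

```python
from lxml import etree

# The snippet from the question, plus two hypothetical headings:
# "Impact" has no <p> of its own before the next <h3>.
HTML = """
<div id="note">
<a name="overview"></a>
<h3>Overview</h3>
<p>some text1...</p>
<a name="description"></a>
<h3>Description</h3>
<p>some text2 ...</p>
<h3>Impact</h3>
<h3>Credit</h3>
<p>some text3</p>
</div>
"""

# Nearest following <p> sibling, even if another <h3> comes first.
LOOSE = 'following-sibling::p[1]'
# Nearest following sibling that is a <p> or an <h3>, kept only if it is a <p>.
STRICT = 'following-sibling::*[self::p or self::h3][1][self::p]'


def paragraphs_by_heading(root, expression):
    """Map each h3's text to the text of the matched p, or None if no match."""
    result = {}
    for header in root.iter('h3'):
        match = header.xpath(expression)
        result[header.text] = match[0].text if match else None
    return result


root = etree.fromstring(HTML, etree.HTMLParser())
loose = paragraphs_by_heading(root, LOOSE)
strict = paragraphs_by_heading(root, STRICT)

print(loose['Impact'])   # some text3 (borrowed from the Credit section)
print(strict['Impact'])  # None
```

With the strict expression a heading without its own paragraph maps to None, so the caller can decide whether to skip it or look for other element types.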


2013-07-31 06:39:25 alecxe

Thanks for your reply. I get this error: lxml.etree.XMLSyntaxError: Opening and ending tag mismatch: link line 1 and head, line 1, column 485. The code: from lxml import etree; import urllib; from StringIO import StringIO; url = 'http://www.kb.cert.org/vuls/id/628463'; text = urllib.urlopen(url).read(); f = StringIO(text); tree = etree.parse(f); headers = tree.xpath('//h3'); for header in headers: paragraph = header.xpath('(.//following-sibling::p)[1]')[0]; print "%s: %s" % (header.text, paragraph.text). P.S. I am new to Python and XPath. – Gomeisa

@Golbarghajian I've updated the code, please check. – alecxe

It works, thank you sooo much. – Gomeisa
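
The XMLSyntaxError quoted in the comments is the kind of error lxml's default XML parser raises when a tag such as link is never closed, which is legal HTML but not well-formed XML. Below is a minimal sketch of the difference, using a made-up page string (PAGE is an assumption, not the real URL's content) so it runs without network access:

```python
from io import StringIO
from lxml import etree

# Hypothetical page: fine as HTML, but <link> is never closed,
# so it is not well-formed XML.
PAGE = (
    '<html><head><link rel="stylesheet" href="style.css"></head>'
    '<body><h3>Overview</h3><p>some text1...</p></body></html>'
)

# Without an explicit parser, etree.parse() uses the strict XML parser.
try:
    etree.parse(StringIO(PAGE))
    xml_error = None
except etree.XMLSyntaxError as exc:
    xml_error = exc

# Passing etree.HTMLParser() makes lxml tolerate HTML-style markup.
tree = etree.parse(StringIO(PAGE), etree.HTMLParser())
headings = [h.text for h in tree.iter('h3')]

print(type(xml_error).__name__)
print(headings)
```

Passing the HTML parser explicitly, as the answer's code does, is what avoids the error.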
