How to select the next node using Scrapy

My HTML looks like this:

<h1>Text 1</h1> 
<div>Some info</div> 
<h1>Text 2</h1> 
<div>...</div>

I figured out how to extract information from the h1 with Scrapy:

content.select("//h1[contains(text(),'Text 1')]/text()").extract()

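But my goal is to extract the content of <div>Some info</div>.

My problem is that I don't have any specific information about the div. All I know is that it comes right after <h1>Text 1</h1>. Can I use a selector to get the NEXT element in the tree, that is, the element at the same level of the DOM tree?

Something like: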

a = content.select("//h1[contains(text(),'Text 1')]/text()") 
a.next("//div/text()").extract() 
Some info

2013-11-04 SkyFox

Try this XPath:

//h1[contains(text(), 'Text 1')]/following-sibling::div[1]/text()

2013-11-04 13:09:36 kev
