Remove span from a div with an XPath selector in Scrapy

I need to extract only the value 19.10 from the HTML below, but my XPath is not working. Thanks in advance.

<div class="class1">19.10 
     <span class="class2"><br></span> 
</div>

My XPath:

//div[@class='class1'][not(preceding::span[@class='class2'])]/text()

Result:

[u'19.10\n\t\t\t\t\t\t\t', u'\n\t\t\t\t\t\t']

2016-08-08 DevOps

You want the first text node here. There are a few ways to get it. Using XPath:

"//div[@class='class1'][not(preceding::span[@class='class2'])]/text()[1]"

Or with post-processing:

# just first element 
response.xpath("xpath").extract_first()
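
Note that the first text node still carries trailing whitespace, as the result in the question shows. A minimal sketch of cleaning it up in plain Python, assuming the list below is what `.extract()` returned (the names `raw`, `cleaned` and `value` are only for illustration):

```python
# Text nodes as returned by the question's XPath (copied from its result).
raw = ['19.10\n\t\t\t\t\t\t\t', '\n\t\t\t\t\t\t']

# Strip surrounding whitespace and drop nodes that were whitespace only.
cleaned = [text.strip() for text in raw if text.strip()]

# Take the first remaining value, or None if nothing is left.
value = cleaned[0] if cleaned else None
print(value)
```

Filtering before indexing avoids an IndexError when the div contains no text at all.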

Or, if you are familiar with item loaders:

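# Note: newer Scrapy versions provide TakeFirst in itemloaders.processors.
from scrapy.loader.processors import TakeFirst
from scrapy.loader import ItemLoader
class MyItemLoader(ItemLoader):
    # Keep only the first extracted value for 'myfield'.
    myfield_out = TakeFirst()
# The loader needs a response (or selector) to run the XPath against.
ml = MyItemLoader(response=response)
ml.add_xpath('myfield', 'xpath')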

2016-08-08 06:04:01 Granitosaurus

It works, thanks a lot. – DevOps

Try the XPath below:

string(//div[@class='class1'])

or

(//div[@class='class1']/text())[1]

2016-08-08 06:06:04

It works, thanks a lot. – DevOps
