Scrapy: How to extract content from a nested div (XPath selector)?

Please see the HTML markup below. How can I use an XPath selector in Scrapy to extract the content of the div whose class is col-sm-7?

I want to extract this text:

Infortrend EonNAS Pro 850X 8-bay Tower NAS with 10GbE

HTML：

<div class="pricing panel panel-primary"> 
    <div class="panel-heading">Infortrend Products</div> 
    <div class="body"> 
    <div class="panel-subheading"><strong>EonNAS Pro Models</strong></div> 
    <div class="row"> 
    <div class="col-sm-7"><strong>Infortrend EonNAS Pro 850X 8-bay Tower NAS with 10GbE</strong><br /> 
     <small>Intel Core i3 Dual-Core 3.3GHz Processor, 8GB DDR3 RAM (Drives Not Included)</small></div> 
    <div class="col-sm-3">#ENP8502MD-0030<br /> 
     <strong> Our Price: $2,873.00</strong></div> 
    <div class="col-sm-2"> 
     <form action="/addcart.asp" method="get"> 
     <input type="hidden" name="item" value="ENP8502MD-0030 - Infortrend EonNAS Pro 850X 8-bay Tower NAS with 10GbE (Drives Not Included)"> 
     <input type="hidden" name="price" value="$2873.00"> 
     <input type="hidden" name="custID" value=""> 
     <input type="hidden" name="quantity" value="1"> 
     <button type="submit" class="btn btn-primary center-block"><i class="fa fa-shopping-cart"></i> Add to Cart</button> 
     </form> 
    </div> 
    </div> 
    </div> 
    </div>

I tried this command, but it doesn't work:

response.xpath('//div[@class="pricing panel panel-primary"]/div[@class="panel-heading"]/text()/div[@class="body"]//div[@class="panel-subheading" and contains(@style,'font-weight:bold')]/text()').extract_first()


2017-04-02 Vy Nguyen

Try this XPath expression: '//div[@class="col-sm-7"]/strong/text()' – vold

Try this:

response.xpath('//*[@class="col-sm-7"]//strong//text()').extract()

Hope it helps :)


2017-04-02 20:41:26

You can extract the text inside the <strong> element with something like this:

print(response.xpath('//div[@class="col-sm-7"]//text()').extract()[0].strip())

or

print(response.xpath('//div[@class="col-sm-7"]/strong/text()').extract()[0].strip())

The above statements will result in this:

Infortrend EonNAS Pro 850X 8-bay Tower NAS with 10GbE

You can also get the text of every element inside this div, including the text inside the <strong> and <small> tags, by using //text(), like this:

texts = response.xpath('//div[@class="col-sm-7"]//text()').extract()
elem_text = ' '.join(txt.strip() for txt in texts if txt.strip())  # skip whitespace-only nodes
print(elem_text)

which will result in:

Infortrend EonNAS Pro 850X 8-bay Tower NAS with 10GbE Intel Core i3 Dual-Core 3.3GHz Processor, 8GB DDR3 RAM (Drives Not Included)


2017-04-02 20:44:09
