If statement not working in Scrapy

I have built a crawler with Scrapy that reads a sitemap and scrapes the required components from every link in it.

from scrapy.selector import Selector
from scrapy.spiders import SitemapSpider

# MyItem is the project's own Item class; import it from your items module.

class MySpider(SitemapSpider):
    name = "functie"
    allowed_domains = ["xyz.nl"]
    sitemap_urls = ["http://www.xyz.nl/sitemap.xml"]

    def parse(self, response):
        item = MyItem()
        sel = Selector(response)

        item['url'] = response.url
        # Each .extract() call returns a list of strings.
        item['h1'] = sel.xpath("//h1[@class='no-bd']/text()").extract()
        item['jobtype'] = sel.xpath('//input[@name=".Keyword"]/@value').extract()
        item['count'] = sel.xpath('//input[@name="Count"]/@value').extract()
        item['location'] = sel.xpath('//input[@name="Location"]/@value').extract()
        yield item

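item['location'] can be empty in some cases. In that particular case I want to scrape a different component and store it in item['location'] instead. This is the code I have tried:

item['location'] = sel.xpath('//input[@name="Location"]/@value').extract()
if not item['location']:
    item['location'] = sel.xpath('//a[@class="location"]/text()').extract()

But the if condition never seems to be checked: when the value of the Location input field is empty, the item still comes back empty. Any help would be very useful.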


2014-04-14 sulav_lfc

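Are you sure the condition is not being hit, or could it be that the second `sel.xpath` also returns an "empty" value? Have you checked, for example by putting a print statement there? Also, what exactly is that "empty value"?

`.extract()` returns a list. A list containing a single empty string evaluates to True. – warvariuc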
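
The point in the comment above can be checked in a plain Python shell, without Scrapy. This is only a minimal sketch: the two lists are assumed stand-ins for what `.extract()` might return, and the variable names are made up for illustration.

```python
# Assumed stand-ins for the two possible .extract() results.
no_match = []        # the XPath matched no node at all
empty_value = [""]   # the input exists, but its value attribute is ""

# An empty list is falsy, so the body of `if not ...` would run.
print(not no_match)      # True

# A list holding one empty string is truthy, so the body is skipped.
print(not empty_value)   # False
```

So if the page always contains the Location input, just with an empty value, the `if not item['location']` branch would never run.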

You may want to check the length of item['location'].

item['location'] = sel.xpath('//input[@name="Location"]/@value').extract()
if len(item['location']) < 1:
    item['location'] = sel.xpath('//a[@class="location"]/text()').extract()
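
Note that a length check alone still treats `['']` as non-empty, because that list has one element. Below is a rough sketch of a fallback that also handles blank strings. `choose_location` is a hypothetical helper, not part of Scrapy, and it assumes both arguments are the lists already returned by `.extract()`.

```python
def choose_location(primary, fallback):
    """Return primary if it holds any non-blank string, otherwise fallback."""
    # strip() turns "" and whitespace-only strings into "", which is falsy.
    if any(value.strip() for value in primary):
        return primary
    return fallback


print(choose_location([""], ["Amsterdam"]))         # ['Amsterdam']
print(choose_location(["Utrecht"], ["Amsterdam"]))  # ['Utrecht']
print(choose_location([], ["Amsterdam"]))           # ['Amsterdam']
```

In the spider, the two arguments would be the results of the input XPath and the link XPath respectively.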

In any case, have you considered combining the two XPaths with |?

item['location'] = sel.xpath('//input[@name="Location"]/@value | //a[@class="location"]/text()').extract()
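
One caveat with the union: when the input exists but its value is empty, the attribute node is still selected, so the result can contain an empty string next to the link text. Here is a sketch of filtering those out afterwards. `drop_blanks` is a hypothetical helper, and the sample list is an assumed result rather than output from the real site.

```python
def drop_blanks(values):
    """Keep only strings that contain non-whitespace characters, stripped."""
    return [value.strip() for value in values if value.strip()]


# Assumed result of the combined XPath when the input's value is empty.
combined = ["", " Amsterdam "]
print(drop_blanks(combined))  # ['Amsterdam']
```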


2014-04-14 14:49:29 ScrapyNovice

Try this approach:

# .extract() returns a list, so compare against list values rather than "".
if item['location'] in ([], [""]):
    item['location'] = sel.xpath('//a[@class="location"]/text()').extract()


2014-04-15 08:00:54 mrki
