Scrapy returns only the first item

I'm learning Scrapy, and for some reason it only returns the first item on the page. Can anyone tell me what I'm doing wrong?

Here is my code:

from scrapy import Selector, Spider
from scrapy.exceptions import CloseSpider

# RuvillaItem is the project's own Item subclass (its definition is not shown here).


class RuvillaSpider(Spider):
    name = "RuvillaSpider"
    allowed_domains = ["ruvilla.com"]
    start_urls = ["https://www.ruvilla.com/men/footwear.html?dir=desc&limit=45&order=news_from_date"]

    def parse(self, response):
        products = Selector(response).xpath('//div[@class="category-products"]')

        if not products:
            raise CloseSpider('RuvillaSpider: DONE, NO MORE PAGES.')

        for product in products:
            item = RuvillaItem()
            item['name'] = product.xpath('ul/li/div/div[1]/a/@title').extract()[0]
            item['link'] = product.xpath('ul/li/div/div[1]/a/@href').extract()[0]
            item['image'] = product.xpath('ul/li/div/div[1]/a/img/@src').extract()[0]
            yield item

2017-02-27 user3737709

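Apart from the class definition, there is *a priori* nothing wrong here. It would also help to see what you do with an instance of this class. For example, that might tell us whether you realize you are using a generator. – Kanak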

Your XPath seems to return only one node for the products variable.

Try:

$ scrapy shell "https://www.ruvilla.com/men/footwear.html?dir=desc&limit=45&order=news_from_date" 
In[1]: response.xpath('//div[@class="category-products"]') 
Out[1]: [<Selector xpath='//div[@class="category-products"]' data=u'<div class="category-products">\n<div cla'>]

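So your XPath matches not the individual items but the container that holds them all. To fix this, you need an XPath that selects each product rather than the single container: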

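from scrapy import Request

def parse(self, response):
    # Select one node per product instead of the single container.
    products = response.xpath('//div[@class="category-products"]//li[contains(@class,"item")]')

    for product in products:
        item = dict()
        # These paths are relative to the current product node.
        item['name'] = product.xpath('.//a/@title').extract_first()
        item['link'] = product.xpath('.//a/@href').extract_first()
        item['image'] = product.xpath('.//a/img/@src').extract_first()
        yield item

    # Follow the pager link that comes after the current page, if there is one.
    next_page = response.xpath("//li[@class='current']/following-sibling::li[1]/a/@href").extract_first()
    if next_page:
        # urljoin resolves a relative href against the current page URL.
        yield Request(response.urljoin(next_page))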
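
The difference between the two selectors can be reproduced offline. The following is only a minimal sketch: it uses the standard library's ElementTree as a stand-in for Scrapy's selectors, the three-product markup and the helper `first_titles` are made up for illustration, and the item path is written as `ul/li` because ElementTree does not support `contains()`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page is more deeply nested.
SAMPLE = """
<html><body>
<div class="category-products">
  <ul>
    <li class="item"><a href="/shoe-a.html" title="Shoe A"><img src="/a.jpg"/></a></li>
    <li class="item"><a href="/shoe-b.html" title="Shoe B"><img src="/b.jpg"/></a></li>
    <li class="item"><a href="/shoe-c.html" title="Shoe C"><img src="/c.jpg"/></a></li>
  </ul>
</div>
</body></html>
"""

root = ET.fromstring(SAMPLE)


def first_titles(nodes):
    """Mimic the question's loop: one result per matched node, first title only."""
    return [node.find(".//a").get("title") for node in nodes]


# The container XPath matches a single node, so the loop body runs once.
containers = root.findall(".//div[@class='category-products']")

# Selecting the <li> elements gives one node, and so one result, per product.
items = root.findall(".//div[@class='category-products']/ul/li")

print(len(containers), first_titles(containers))
print(len(items), first_titles(items))
```

With the container selector the loop sees one node and takes the first title inside it, which is the "only the first item" symptom from the question.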

2017-02-27 04:24:24 Granitosaurus

Your XPath is wrong.

Use this XPath:

('//div[@class="category-products"]/ul/li')
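
One thing to watch when switching to a per-`li` selector: the field paths in the question start with `ul/li/`, so they would match nothing once each `li` is the context node. The sketch below illustrates this with the standard library's ElementTree standing in for Scrapy's selectors. The two-product markup is hypothetical and only follows the nesting implied by the question's paths (`ul/li/div/div[1]/a`).

```python
import xml.etree.ElementTree as ET

# Hypothetical markup mirroring the nesting implied by the question's XPaths.
PAGE = """<body><div class="category-products"><ul>
<li><div><div><a href="/shoe-a.html" title="Shoe A"><img src="/a.jpg"/></a></div></div></li>
<li><div><div><a href="/shoe-b.html" title="Shoe B"><img src="/b.jpg"/></a></div></div></li>
</ul></div></body>"""

root = ET.fromstring(PAGE)
products = root.findall(".//div[@class='category-products']/ul/li")

# The question's field path starts at the container, so from an <li> it finds nothing.
old_path_hits = [li.find("ul/li/div/div[1]/a") for li in products]

# Dropping the leading "ul/li/" makes the path relative to each <li>.
names = [li.find("div/div[1]/a").get("title") for li in products]
links = [li.find("div/div[1]/a").get("href") for li in products]
images = [li.find("div/div[1]/a/img").get("src") for li in products]

print(old_path_hits)
print(list(zip(names, links, images)))
```

In Scrapy the equivalent change would be to shorten the relative paths in the loop, for example `product.xpath('div/div[1]/a/@title')`, assuming the page really has this nesting.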

2017-02-27 05:35:38 bbanzzakji
