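Need help with a YellowPages spider

I'm new to Scrapy, and so far I've managed to create a few spiders. I want to write a spider that crawls Yellowpages and looks for listed websites that return a 404 response. The spider itself works fine, but the pagination doesn't. Any help would be much appreciated. Thanks in advance.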
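# -*- coding: utf-8 -*-
import scrapy


class SpiderSpider(scrapy.Spider):
    name = 'spider'
    #allowed_domains = ['www.yellowpages.com']
    start_urls = ['https://www.yellowpages.com/search?search_terms=handyman&geo_location_terms=Miami%2C+FL']
    # Let 404 responses reach the callbacks; Scrapy drops non-2xx responses by default.
    handle_httpstatus_list = [404]

    def parse(self, response):
        for listing in response.css('div.search-results.organic div.srp-listing'):
            url = listing.css('a.track-visit-website::attr(href)').extract_first()
            # Listings without a website link give None; skip them, because
            # scrapy.Request(url=None) raises and aborts the rest of parse().
            if url:
                yield scrapy.Request(url=url, callback=self.parse_details)

        # follow pagination links
        next_page_url = response.css('a.next.ajax-page::attr(href)').extract_first()
        # Check the raw href before joining: response.urljoin(None) returns the current page URL.
        if next_page_url:
            yield scrapy.Request(url=response.urljoin(next_page_url), callback=self.parse)

    def parse_details(self, response):
        # Yield plain fields rather than the Response object so the item can be exported.
        yield {'url': response.url, 'status': response.status}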
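Hi David, this is my first post here, and I had trouble formatting the code. My question is simple: I have a pagination problem with this spider. I'm not sure what I'm missing here. – oscarQ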
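
One thing that may be worth checking on the pagination side: the test for a next page needs to run on the raw href that the selector returns, not on the joined URL. Joining a missing href against the page URL still produces a non-empty string (the page URL itself), so a check made after the join never fails. The sketch below shows the guard with only the standard library. The helper name `next_page_url` and the `page=2` link are made up for illustration, and it assumes the "next" link carries a relative href.

```python
from urllib.parse import urljoin


def next_page_url(page_url, href):
    """Return the absolute URL of the next page, or None when there is none.

    `href` is the raw value pulled from the "next" link. It is None when
    the selector matched nothing (last page, or the markup differs).
    """
    if not href:
        # Nothing to follow: stop here instead of re-requesting the same page.
        return None
    return urljoin(page_url, href)


# Hypothetical values, only to show both branches.
base = "https://www.yellowpages.com/search?search_terms=handyman"
print(next_page_url(base, "/search?search_terms=handyman&page=2"))
print(next_page_url(base, None))
```

If the second call is what happens on the first results page, then the selector for the "next" link is not matching, and the CSS classes should be compared against the page source that Scrapy actually receives (for example in `scrapy shell`).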