2017-02-17

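Python Scrapy: print start_url or a variable

I am trying to pass "number" along to a later callback, or else get the start_url and then parse it to recover the number.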

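import csv

from scrapy import Spider


class EbaypriceSpider(Spider):
    name = "ebayprice"
    allowed_domains = ["www.ebay.com"]
    start_urls = []
    # Text mode with newline='' is what the csv module expects on Python 3
    with open('Numbers.csv', newline='') as omcan_numbers:
        number_list = csv.reader(omcan_numbers)
        for row in number_list:
            # csv.reader yields each row as a list; this assumes the number is in the first column
            number = row[0]
            start_urls.append('http://www.ebay.com/sch/Omcan' + number)

    def parse(self, response):
        # Do stuff, then call parse_page2
        pass

    def parse_page2(self, response):
        # I want to get the start URL or the number here,
        # but `number` is not defined inside this method
        print(number)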

Answers

Instead of start_urls, use the start_requests method:

import csv

from scrapy import Request, Spider


class EbaypriceSpider(Spider):
    name = "ebayprice"
    allowed_domains = ["www.ebay.com"]

    def start_requests(self):
        # Text mode with newline='' is what the csv module expects on Python 3
        with open('Numbers.csv', newline='') as omcan_numbers:
            number_list = csv.reader(omcan_numbers)
            for row in number_list:
                # csv.reader yields each row as a list; this assumes the number is in the first column
                url = 'http://www.ebay.com/sch/Omcan' + row[0]
                yield Request(url, meta={'start_url': url}, callback=self.parse)

    def parse(self, response):
        # Do stuff, then call parse_page2
        ...
        # Keep passing the `meta` argument from the previous request.
        # some_other_url is a placeholder for whatever URL you extract here.
        yield Request(some_other_url, meta=response.meta, callback=self.parse_page2)

    def parse_page2(self, response):
        # The start URL stored in start_requests is available again here
        start_url = response.meta['start_url']
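
If only the start URL is carried in meta, the number can still be recovered from it, which is the fallback the question mentions. The following is a minimal sketch, assuming the URL was built by plain concatenation as in the spider; the helper name number_from_start_url and the constant SEARCH_PREFIX are made up for illustration and are not part of Scrapy.

```python
# Hypothetical helper, not part of Scrapy: recover the number from a start URL
# that was built as SEARCH_PREFIX + number.
SEARCH_PREFIX = "http://www.ebay.com/sch/Omcan"  # same prefix the spider prepends


def number_from_start_url(start_url, prefix=SEARCH_PREFIX):
    """Return the text that follows `prefix`, or None if the URL does not start with it."""
    if not start_url.startswith(prefix):
        return None
    return start_url[len(prefix):]


print(number_from_start_url("http://www.ebay.com/sch/Omcan12345"))  # 12345
```

Inside parse_page2 this would be called as number_from_start_url(response.meta['start_url']). Reading the stored start_url rather than response.url matters here, because response.url may differ from the original URL after a redirect.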

Thank you, this works. Any idea how to get the "number" on its own, without the rest of start_url? response.meta['number'] doesn't seem to work.

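meta is a dictionary, and it only contains what you put into it. Add the number in start_requests, for example meta={'start_url': url, 'some_number': number}, and then read response.meta['some_number'] in the callback. – eLRuLL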

Wow, nice, thank you!