How do I save the crawled page's link into an item with Scrapy?

This is my spider:

rules = (
    Rule(LinkExtractor(allow=r'torrents-details\.php\?id=\d*'), callback='parse_item', follow=True),
)

def parse_item(self, response):
    item = MovieNotifyItem()
    item['title'] = response.xpath('//h5[@class="col s12 light center teal darken-3 white-text"]/text()').extract_first()
    item['size'] = response.xpath('//*[@class="torrent-info"]//tr[1]/td[2]/text()').extract_first()
    item['catagory'] = response.xpath('//*[@class="torrent-info"]//tr[2]/td[2]/text()').extract_first()
    yield item

Now I want to save the page link into an item field, say item['page_link'], for every page that this rule crawls:

rules = (
    Rule(LinkExtractor(allow=r'torrents-details\.php\?id=\d*'), callback='parse_item', follow=True),
)

How can I do that? Thanks in advance.


2016-10-09 Mohib

If I understand correctly, you are looking for response.url:

def parse_item(self, response): 
    item = MovieNotifyItem() 
    item['url'] = response.url # "url" field should be defined for "MovieNotifyItem" Item class 
    # ... 
    yield item


2016-10-09 04:46:38 alecxe

Oh, great! Got it, thanks :D – Mohib

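I have one more question: if I have another rule that crawls to the next page and I want to save that page's link too, how do I do that? @alecxe – Mohib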

@Mohib I think you can get the 'referer' in this case, see http://stackoverflow.com/questions/12054958/scrapyhow-to-print-request-referrer. – alecxe
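
The referer suggestion can be sketched roughly as follows. The helper name get_referer is hypothetical. It assumes the Referer header was set on the request, which Scrapy's default RefererMiddleware normally does for requests generated from a crawled page:

```python
def get_referer(response):
    """Return the Referer header of the request behind `response` as str, or None."""
    # Scrapy stores header values as bytes; the header may be missing
    # (for example on start URLs, or when the referrer policy drops it).
    referer = response.request.headers.get("Referer")
    if referer is None:
        return None
    return referer.decode("utf-8")
```

Inside parse_item this could be used as item['referer'] = get_referer(response), where 'referer' is an assumed extra field that would also have to be declared on MovieNotifyItem.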
