Python - importing a class from a module
I have created this class:
class PitchforkSpider(scrapy.Spider):
    name = "pitchfork_reissues"
    allowed_domains = ["pitchfork.com"]
    # creates objects for each URL listed here
    start_urls = [
        "http://pitchfork.com/reviews/best/reissues/?page=1",
        "http://pitchfork.com/reviews/best/reissues/?page=2",
        "http://pitchfork.com/reviews/best/reissues/?page=3",
    ]

    def parse(self, response):
        for sel in response.xpath('//div[@class="album-artist"]'):
            item = PitchforkItem()
            item['artist'] = sel.xpath('//ul[@class="artist-list"]/li/text()').extract()
            item['reissue'] = sel.xpath('//h2[@class="title"]/text()').extract()
            return item
Then I import the module that the class belongs to:
from blogs.spiders.pitchfork_reissues_feed import *
and try to call parse() in another context:
def reissues(self):
    pitchfork_reissues = PitchforkSpider()
    reissues = pitchfork_reissues.parse('response')
    print(reissues)
But I get the following error:
    pitchfork_reissues.parse('response')
  File "/Users/vitorpatalano/Documents/Code/Soup/Apps/myapp/blogs/blogs/spiders/pitchfork_reissues_feed.py", line 21, in parse
    for sel in response.xpath('//div[@class="album-artist"]'):
AttributeError: 'str' object has no attribute 'xpath'
What am I missing?
That gave me the following traceback: 'reissues = pitchfork_reissues.parse(response) NameError: global name 'response' is not defined'
Well, you need a scrapy.http.Response instance, which is explicitly "downloaded (by the Downloader)". See the [docs](http://doc.scrapy.org/en/latest/topics/request-response.html#response-objects) :) – Jasper