2014-01-07 81 views
0

Scrapy KeyError: I am having trouble creating the spider from the Scrapy tutorial:

http://doc.scrapy.org/en/latest/intro/tutorial.html#our-first-spider

Here is what I have in my spiders/dmoz_spider.py file:

class DmozSpider(object):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    @classmethod
    def from_crawler(cls, crawler):
        spider = crawler.spiders
        return cls(spider)

    def parse(self, response):
        filename = response.url.split("/")[-2]
        open(filename, 'wb').write(response.body)

The good news is that I am fairly sure the spider is being created. The bad news is that I get this error:

(scrapestat)unknownc8e0eb148153:tutorial christopherspears$ scrapy crawl dmoz 
Traceback (most recent call last):
  File "/Users/christopherspears/.virtualenvs/scrapestat/bin/scrapy", line 4, in <module>
    execute()
  File "/Users/christopherspears/.virtualenvs/scrapestat/lib/python2.7/site-packages/scrapy/cmdline.py", line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/Users/christopherspears/.virtualenvs/scrapestat/lib/python2.7/site-packages/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/Users/christopherspears/.virtualenvs/scrapestat/lib/python2.7/site-packages/scrapy/cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "/Users/christopherspears/.virtualenvs/scrapestat/lib/python2.7/site-packages/scrapy/commands/crawl.py", line 48, in run
    spider = crawler.spiders.create(spname, **opts.spargs)
  File "/Users/christopherspears/.virtualenvs/scrapestat/lib/python2.7/site-packages/scrapy/spidermanager.py", line 44, in create
    raise KeyError("Spider not found: %s" % spider_name)
KeyError: 'Spider not found: dmoz'

I am not sure what the problem is. Any hints?

Answers

1

DmozSpider should inherit from BaseSpider (or Spider, depending on your Scrapy version). So make the following change to your code:

from scrapy.spider import BaseSpider 

class DmozSpider(BaseSpider): 
    ... 

I get the same KeyError myself when the spider class inherits from object.
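To illustrate why the base class matters, here is a minimal sketch of a name-to-class lookup that only registers subclasses of a base spider class. It is not Scrapy's actual code: `BaseSpider` here is a stand-in, and `PlainSpider`, `FixedSpider`, `collect_spiders` and `create` are hypothetical names made up for this example. The assumption is that the spider manager skips any class that does not subclass the base spider, so a class inheriting from object never gets registered and the lookup by name fails.

```python
import inspect


class BaseSpider(object):
    """Stand-in for Scrapy's base spider class (not the real one)."""
    name = None


class PlainSpider(object):
    # Inherits from object, like the spider in the question.
    name = "dmoz"


class FixedSpider(BaseSpider):
    # Inherits from the base class, like the spider in the answer.
    name = "dmoz"


def collect_spiders(candidates):
    """Map spider names to classes, keeping only named BaseSpider subclasses."""
    registry = {}
    for obj in candidates:
        is_spider = inspect.isclass(obj) and issubclass(obj, BaseSpider)
        if is_spider and getattr(obj, "name", None):
            registry[obj.name] = obj
    return registry


def create(registry, spider_name):
    """Instantiate a spider by name, or raise KeyError if it was never registered."""
    try:
        spider_class = registry[spider_name]
    except KeyError:
        raise KeyError("Spider not found: %s" % spider_name)
    return spider_class()
```

With this sketch, `create(collect_spiders([PlainSpider]), "dmoz")` raises a KeyError with the same "Spider not found" message, while `create(collect_spiders([FixedSpider]), "dmoz")` returns a spider instance.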

+0

Great! Thanks! Out of curiosity, do I still need the from_crawler method?