
This is probably something basic, but I couldn't find any recent (non-deprecated) examples. Given the code below, why can't scrapy find my spider?

#This is the tutorial project for scrapy 

from scrapy.item import Item, Field 

class DmozItem(Item): 
    title = Field() 
    link = Field() 
    desc = Field() 

from scrapy.spider import Spider 

class DmozSpider(Spider): 
    name = "dmoz" 
    allowed_domains = ["dmoz.org"] 
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/",
    ]

    def parse(self, response):
        filename = response.url.split("/")[-2]
        open(filename, 'wb').write(response.body)

I get this error message:

[email protected] ~/Desktop/Scrapy_Projects/tutorial $ scrapy list 
[email protected] ~/Desktop/Scrapy_Projects/tutorial $ scrapy crawl dmoz 
2014-02-21 15:24:37-0400 [scrapy] INFO: Scrapy 0.14.4 started (bot: tutorial) 
2014-02-21 15:24:37-0400 [scrapy] DEBUG: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, MemoryUsage, SpiderState 
2014-02-21 15:24:37-0400 [scrapy] DEBUG: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, RedirectMiddleware, CookiesMiddleware, HttpCompressionMiddleware, ChunkedTransferMiddleware, DownloaderStats 
2014-02-21 15:24:37-0400 [scrapy] DEBUG: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware 
2014-02-21 15:24:37-0400 [scrapy] DEBUG: Enabled item pipelines: 
Traceback (most recent call last): 
    File "/usr/bin/scrapy", line 4, in <module> 
    execute() 
    File "/usr/lib/python2.7/dist-packages/scrapy/cmdline.py", line 132, in execute 
    _run_print_help(parser, _run_command, cmd, args, opts) 
    File "/usr/lib/python2.7/dist-packages/scrapy/cmdline.py", line 97, in _run_print_help 
    func(*a, **kw) 
    File "/usr/lib/python2.7/dist-packages/scrapy/cmdline.py", line 139, in _run_command 
    cmd.run(args, opts) 
    File "/usr/lib/python2.7/dist-packages/scrapy/commands/crawl.py", line 43, in run 
    spider = self.crawler.spiders.create(spname, **opts.spargs) 
    File "/usr/lib/python2.7/dist-packages/scrapy/spidermanager.py", line 43, in create 
    raise KeyError("Spider not found: %s" % spider_name) 
KeyError: 'Spider not found: dmoz' 

I figured this was probably something quite basic, so I tried looking for examples I could browse through to spot the problem, but I didn't find anything I'd consider recent.

Thanks for the help!


What have you tried already? Also, post the code here instead of linking us somewhere else, and describe how you understand the problem. –

Answers


First of all, the following code belongs in a Python file called items.py:

from scrapy.item import Item, Field

class DmozItem(Item):
    title = Field()
    link = Field()
    desc = Field()

The rest of the code goes in a spider file (spider.py) that you should create inside the spiders folder of your project (all of these steps assume you have first run the startproject command mentioned by hwatkins).
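A minimal sketch of what that spider file would contain (assuming a path like tutorial/spiders/dmoz_spider.py; note that very old Scrapy releases, such as the 0.14.4 in your traceback, named the base class BaseSpider rather than Spider):

from scrapy.spider import Spider  # on Scrapy 0.14 this would be: from scrapy.spider import BaseSpider

class DmozSpider(Spider):
    name = "dmoz"  # the name that "scrapy list" and "scrapy crawl" look up
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/",
    ]

    def parse(self, response):
        # save each downloaded page under the last path segment of its URL
        filename = response.url.split("/")[-2]
        open(filename, 'wb').write(response.body)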

If you want to use DmozItem in your spider, you also need to import it there, as from dmoz.items import DmozItem.
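For example, a parse method that fills the item might look like this (a sketch following the old DMOZ tutorial's pattern; HtmlXPathSelector was the selector API of that Scrapy era, and the exact import path for DmozItem depends on your project's package name):

from scrapy.selector import HtmlXPathSelector
from tutorial.items import DmozItem  # package name matches your project ("tutorial" here)

# inside DmozSpider:
def parse(self, response):
    hxs = HtmlXPathSelector(response)
    items = []
    for site in hxs.select('//ul/li'):
        # build one item per listed site, pulling title, link and description
        item = DmozItem()
        item['title'] = site.select('a/text()').extract()
        item['link'] = site.select('a/@href').extract()
        item['desc'] = site.select('text()').extract()
        items.append(item)
    return items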

I suggest you go through the tutorial again carefully.


Yeah, I missed the part about splitting the code into separate files. I read the tutorial several times, but apparently skipped that part. Thanks, and sorry, it was a silly mistake. – Ajacmac


Did you follow the instructions?

scrapy startproject tutorial 

This creates the project; then save your spider script in a file called dmoz_spider.py inside the tutorial/spiders directory.
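For reference, the generated layout looks roughly like this (a sketch; the exact files can vary slightly between Scrapy versions):

tutorial/
    scrapy.cfg
    tutorial/
        __init__.py
        items.py
        pipelines.py
        settings.py
        spiders/
            __init__.py
            dmoz_spider.py   # your spider goes here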

I just tried your script and it works fine.
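Once the spider file is in place, running the commands from the project directory should show the spider (expected output sketched below):

$ scrapy list
dmoz
$ scrapy crawl dmoz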
