我遵循Scrapy教程文檔http://media.readthedocs.org/pdf/scrapy/0.14/scrapy.pdf,並且我已經驗證了items.py和dmoz_spider.py是正確輸入的(不是剪切&)。Scrapy教程例外
第一個「嗯......」我的部分是該指令:
這是我們的第一個蜘蛛的代碼;將它保存在dmoz/spiders目錄下的一個名爲dmoz_spider.py的文件中
我正在使用最新版本的Ubuntu,教程/教程/蜘蛛。 (這個當時我的第一個錯誤?)
因此,這裏是我的dmoz_spider.py腳本:
from scrapy.spider import BaseSpider
class DmozSpider(BaseSpider):
name = "dmoz"
allowed_domains = ["dmoz.org"]
start_urls = [
"http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
"http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
]
def parse(self, response):
filename = response.url.split("/")[-2]
open(filename, 'wb').write(response.body)
在我的終端I型
scrapy crawl dmoz
而且我得到這個:
2012-10-08 13:20:22-0700 [scrapy] INFO: Scrapy 0.12.0.2546 started (bot: tutorial)
2012-10-08 13:20:22-0700 [scrapy] DEBUG: Enabled extensions: TelnetConsole, SpiderContext, WebService, CoreStats, MemoryUsage, CloseSpider
2012-10-08 13:20:22-0700 [scrapy] DEBUG: Enabled scheduler middlewares: DuplicatesFilterMiddleware
2012-10-08 13:20:22-0700 [scrapy] DEBUG: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, RedirectMiddleware, CookiesMiddleware, HttpCompressionMiddleware, DownloaderStats
2012-10-08 13:20:22-0700 [scrapy] DEBUG: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2012-10-08 13:20:22-0700 [scrapy] DEBUG: Enabled item pipelines:
2012-10-08 13:20:22-0700 [scrapy] DEBUG: Telnet console listening on 0.0.0.0:6023
2012-10-08 13:20:22-0700 [scrapy] DEBUG: Web service listening on 0.0.0.0:6080
2012-10-08 13:20:22-0700 [dmoz] INFO: Spider opened
2012-10-08 13:20:22-0700 [dmoz] DEBUG: Crawled (200) <GET http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/> (referer: None)
2012-10-08 13:20:22-0700 [dmoz] ERROR: Spider error processing <http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/> (referer: <None>)
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1178, in mainLoop
self.runUntilCurrent()
File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 800, in runUntilCurrent
call.func(*call.args, **call.kw)
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 362, in callback
self._startRunCallbacks(result)
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 458, in _startRunCallbacks
self._runCallbacks()
--- <exception caught here> ---
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 545, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/usr/lib/python2.7/dist-packages/scrapy/spider.py", line 62, in parse
raise NotImplementedError
exceptions.NotImplementedError:
2012-10-08 13:20:22-0700 [dmoz] DEBUG: Crawled (200) <GET http://www.dmoz.org/Computers/Programming/Languages/Python/Books/> (referer: None)
2012-10-08 13:20:22-0700 [dmoz] ERROR: Spider error processing <http://www.dmoz.org/Computers/Programming/Languages/Python/Books/> (referer: <None>)
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1178, in mainLoop
self.runUntilCurrent()
File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 800, in runUntilCurrent
call.func(*call.args, **call.kw)
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 362, in callback
self._startRunCallbacks(result)
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 458, in _startRunCallbacks
self._runCallbacks()
--- <exception caught here> ---
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 545, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/usr/lib/python2.7/dist-packages/scrapy/spider.py", line 62, in parse
raise NotImplementedError
exceptions.NotImplementedError:
2012-10-08 13:20:22-0700 [dmoz] INFO: Closing spider (finished)
2012-10-08 13:20:22-0700 [dmoz] INFO: Spider closed (finished)
在我的搜索中,我看到有人說扭曲可能沒有安裝...但不會安裝,如果我使用用於Scrapy的Ubuntu軟件包安裝程序?
在此先感謝!
¿爲什麼不首先檢查它是否實際安裝?不要相信你的猜測:) – Alfabravo