You should rely on the verbose output that the Scrapy shell prints when it starts.
$ scrapy shell http://en.wikipedia.org/wiki/Main_Page
If your verbose output looks like this:
2014-09-20 23:02:14-0400 [scrapy] INFO: Scrapy 0.14.4 started (bot: scrapybot)
2014-09-20 23:02:14-0400 [scrapy] DEBUG: Enabled extensions: TelnetConsole, CloseSpider, WebService, CoreStats, MemoryUsage, SpiderState
2014-09-20 23:02:15-0400 [scrapy] DEBUG: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, RedirectMiddleware, CookiesMiddleware, HttpCompressionMiddleware, ChunkedTransferMiddleware, DownloaderStats
2014-09-20 23:02:15-0400 [scrapy] DEBUG: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2014-09-20 23:02:15-0400 [scrapy] DEBUG: Enabled item pipelines:
2014-09-20 23:02:15-0400 [scrapy] DEBUG: Telnet console listening on 0.0.0.0:6023
2014-09-20 23:02:15-0400 [scrapy] DEBUG: Web service listening on 0.0.0.0:6080
2014-09-20 23:02:15-0400 [default] INFO: Spider opened
2014-09-20 23:02:15-0400 [default] DEBUG: Crawled (200) <GET http://en.wikipedia.org/wiki/Main_Page> (referer: None)
[s] Available Scrapy objects:
[s] hxs <HtmlXPathSelector xpath=None data=u'<html lang="en" dir="ltr" class="client-'>
[s] item {}
[s] request <GET http://en.wikipedia.org/wiki/Main_Page>
[s] response <200 http://en.wikipedia.org/wiki/Main_Page>
[s] settings <CrawlerSettings module=None>
[s] spider <BaseSpider 'default' at 0xb5d95d8c>
[s] Useful shortcuts:
[s] shelp() Shell help (print this help)
[s] fetch(req_or_url) Fetch request (or URL) and update local objects
[s] view(response) View response in a browser
Python 2.7.6 (default, Mar 22 2014, 22:59:38)
Type "copyright", "credits" or "license" for more information.
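Your verbose output lists the "Available Scrapy objects", so whether you use hxs or sel depends on what that list shows. In your case hxs is not available, so you need to use sel (newer Scrapy versions). For some people hxs is fine; for others, sel is what they need to use.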
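The check described above can also be done mechanically. Here is a minimal sketch, assuming the banner keeps the `[s] ` prefix and the "Available Scrapy objects" / "Useful shortcuts" headings shown in the output above. The helper names `available_objects` and `selector_name` are made up for this illustration and are not part of Scrapy.

```python
def available_objects(banner):
    """Return the names listed under 'Available Scrapy objects' in the banner."""
    names = []
    in_objects = False
    for line in banner.splitlines():
        if not line.startswith("[s]"):
            continue  # skip log lines and the Python banner
        body = line[3:].strip()
        if body.startswith("Available Scrapy objects"):
            in_objects = True
            continue
        if body.startswith("Useful shortcuts"):
            break  # the object list ends where the shortcuts begin
        if in_objects and body:
            names.append(body.split()[0])  # first token is the object name
    return names


def selector_name(names):
    """Pick whichever selector object the shell exposes (hypothetical helper)."""
    for candidate in ("sel", "hxs"):
        if candidate in names:
            return candidate
    return None


# Banner excerpt copied from the shell output above.
banner = """\
[s] Available Scrapy objects:
[s] hxs <HtmlXPathSelector xpath=None data=u'<html lang="en" dir="ltr" class="client-'>
[s] item {}
[s] request <GET http://en.wikipedia.org/wiki/Main_Page>
[s] response <200 http://en.wikipedia.org/wiki/Main_Page>
[s] settings <CrawlerSettings module=None>
[s] spider <BaseSpider 'default' at 0xb5d95d8c>
[s] Useful shortcuts:
[s] shelp() Shell help (print this help)
"""

names = available_objects(banner)
print(names)
print(selector_name(names))
```

For the banner above this reports hxs; if your own banner lists sel instead, that is the name to type in the shell.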
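Works like a charm. Thanks so much for bringing this to my attention. – 2014-09-21 02:52:31
@MattO'Brien Glad to help. Though, not sure why someone downvoted it.. – alecxe 2014-09-21 03:02:24
Strange, it had +2 a few minutes ago. Looks like it got 2 downvotes then, and it has only been up for 10 minutes or so... it also shows only 4 views! – 2014-09-21 03:05:33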