fetch in the Scrapy shell does not update objects. What am I missing here?
I open the Scrapy shell as follows:
scrapy shell "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/"
This gives me:
[s] Available Scrapy objects:
[s] hxs <HtmlXPathSelector xpath=None data=u'<html><head><meta http-equiv="Content-Ty'>
[s] item {}
[s] request <GET http://www.dmoz.org/Computers/Programming/Languages/Python/Books/>
[s] response <200 http://www.dmoz.org/Computers/Programming/Languages/Python/Books/>
[s] settings <CrawlerSettings module=None>
[s] spider <BaseSpider 'default' at 0x9e1d3ec>
[s] Useful shortcuts:
[s] shelp() Shell help (print this help)
[s] fetch(req_or_url) Fetch request (or URL) and update local objects
[s] view(response) View response in a browser
I get the title from the response, as expected:
In [1]: hxs.select('//title')
Out[1]: [<HtmlXPathSelector xpath='//title' data=u'<title>Open Directory - Computers: Progr'>]
Now I follow up with a simple fetch:
In [2]: fetch("http://www.google.com")
The shell output suggests that the objects have been updated:
In [2]: fetch("http://www.google.com")
2013-10-18 23:10:09+0530 [default] DEBUG: Redirecting (302) to <GET http://www.google.co.in/?gws_rd=cr&ei=eHJhUo2sOobSrQeM5ICAAg> from <GET http://www.google.com>
2013-10-18 23:10:09+0530 [default] DEBUG: Crawled (200) <GET http://www.google.co.in/?gws_rd=cr&ei=eHJhUo2sOobSrQeM5ICAAg> (referer: None)
[s] Available Scrapy objects:
[s] hxs <HtmlXPathSelector xpath=None data=u'<html itemscope="" itemtype="http://sche'>
[s] item {}
[s] request <GET http://www.google.com>
[s] response <200 http://www.google.co.in/?gws_rd=cr&ei=eHJhUo2sOobSrQeM5ICAAg>
[s] settings <CrawlerSettings module=None>
[s] spider <BaseSpider 'default' at 0x9e1d3ec>
[s] Useful shortcuts:
[s] shelp() Shell help (print this help)
[s] fetch(req_or_url) Fetch request (or URL) and update local objects
[s] view(response) View response in a browser
However, I find that they have not. view(response) still shows me the DMOZ page,
and extracting the title gives the same old result:
In [3]: hxs.select('//title')
Out[3]: [<HtmlXPathSelector xpath='//title' data=u'<title>Open Directory - Computers: Progr'>]
What am I missing here?
Thanks!
What scrapy/python/ipython versions are you using? This works for me. – Rolando
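For anyone wanting to answer the version question above, here is a minimal sketch of one way to collect that information. It assumes a current Python 3.8+ install (the session in the question looks like Python 2, where `importlib.metadata` is not available), and the helper names `installed_version` and `collect_versions` are made up for illustration. Scrapy's own `scrapy version -v` command reports similar details from the command line.

```python
# Sketch only: gather the Python, Scrapy and IPython versions for a bug report.
# Assumes Python 3.8+; the function names here are hypothetical helpers.
import platform
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


def collect_versions(package_names=("Scrapy", "ipython")):
    """Map 'python' and each package name to its version (None if missing)."""
    versions = {"python": platform.python_version()}
    for name in package_names:
        versions[name] = installed_version(name)
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print("{}: {}".format(name, version or "not installed"))
```

Pasting that output into the question would make it easier to tell whether the stale `hxs` and `response` objects are specific to one combination of versions.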