2011-09-21 116 views

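Scrapy "parse" function is not being executed

I have started using Scrapy on Ubuntu 11 and am facing a problem. Specifically, the parse function in the code below is not executed, even though the terminal shows that the spider ran and finished successfully.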

from scrapy.contrib.spiders import CrawlSpider 
from scrapy.selector import HtmlXPathSelector 



class myTestSpider(CrawlSpider): 
    name="go4mumbai.com" 
    domain_name = "go4mumbai.com" 
    start_urls = ["http://www.go4mumbai.com/Mumbai_Bus_Route.php?busno=1"] 

def parse(self, response): 
    hxs = HtmlXPathSelector(response) 
    stopNames=hxs.select('//table[@cellspacing="2"]/tr/td[2]/a/text()').extract() 
    print len(stopNames) 

SPIDER = myTestSpider() 

The following output is from the terminal:

[email protected]:~/Desktop/ScrappyTest/basetest$ sudo scrapy crawl go4mumbai.com 
2011-09-21 15:33:56+0530 [scrapy] INFO: Scrapy 0.12.0.2528 started (bot: basetest) 
2011-09-21 15:33:56+0530 [scrapy] DEBUG: Enabled extensions: TelnetConsole,  SpiderContext, WebService, CoreStats, MemoryUsage, CloseSpider 
2011-09-21 15:33:56+0530 [scrapy] DEBUG: Enabled scheduler middlewares:  DuplicatesFilterMiddleware 
2011-09-21 15:33:56+0530 [scrapy] DEBUG: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, RedirectMiddleware, CookiesMiddleware, HttpCompressionMiddleware, DownloaderStats 
2011-09-21 15:33:56+0530 [scrapy] DEBUG: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware 
2011-09-21 15:33:56+0530 [scrapy] DEBUG: Enabled item pipelines: 
2011-09-21 15:33:56+0530 [scrapy] DEBUG: Telnet console listening on 0.0.0.0:6023 
2011-09-21 15:33:56+0530 [scrapy] DEBUG: Web service listening on 0.0.0.0:6080 
2011-09-21 15:33:56+0530 [go4mumbai.com] INFO: Spider opened 
2011-09-21 15:33:58+0530 [go4mumbai.com] DEBUG: Crawled (200) <GET http://www.go4mumbai.com/Mumbai_Bus_Route.php?busno=1> (referer: None) 
2011-09-21 15:33:58+0530 [go4mumbai.com] INFO: Closing spider (finished) 
2011-09-21 15:33:58+0530 [go4mumbai.com] INFO: Spider closed (finished) 

Is there some part of the code that I have missed? Please advise.

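Try using `self.log(len(stopNames))` instead of `print len(stopNames)`. I'm not sure, but I think print doesn't work inside the crawler itself. Edit: never mind, I just tried it and it doesn't seem to be a problem. – naeg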

Answers

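The parse() function does not appear to belong to your spider class. Indent the whole function by one level so that it becomes a method of the class and gets called.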
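
A minimal sketch of why the indentation matters, written without Scrapy so it runs on its own. `FrameworkSpider`, `crawl`, `BrokenSpider` and `FixedSpider` are hypothetical stand-ins, not Scrapy's real classes; the only assumption is that the framework calls `self.parse` and that the base class supplies a default `parse` that produces nothing.

```python
# Hypothetical stand-in for a framework base class (not Scrapy's real code).
class FrameworkSpider(object):
    def parse(self, response):
        # Default callback: produces no results and prints nothing.
        return []

    def crawl(self, response):
        # The framework calls self.parse, whichever class defines it.
        return self.parse(response)


class BrokenSpider(FrameworkSpider):
    name = "broken"

# Not indented: this is a module-level function, not a method of BrokenSpider,
# so BrokenSpider still uses the inherited FrameworkSpider.parse.
def parse(self, response):
    return ["stops parsed from %s" % response]


class FixedSpider(FrameworkSpider):
    name = "fixed"

    # Indented one level: this overrides FrameworkSpider.parse and gets called.
    def parse(self, response):
        return ["stops parsed from %s" % response]


broken_result = BrokenSpider().crawl("page")
fixed_result = FixedSpider().crawl("page")
print(broken_result)
print(fixed_result)
```

In the question's code the same thing happens: the spider runs and closes cleanly because an inherited parse handles the response, while the unindented parse is never looked up.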

I indented the function, but there is still no output. – Rupin

Is the terminal output still the same? Could you update the code in the original post? – naeg

My bad! I do get an output :) – Rupin