2013-07-13

I want to call a second parse function from the main parse function, but it doesn't work: Scrapy's Request is never executed.

Here is the code:

class CodechefSpider(CrawlSpider):
    name = "codechef_crawler"
    allowed_domains = ["codechef.com"]
    start_urls = ["http://www.codechef.com/problems/easy/","http://www.codechef.com/problems/medium/","http://www.codechef.com/problems/hard/","http://www.codechef.com/problems/challenege/"]

    rules = (Rule(SgmlLinkExtractor(allow=('/problems/[A-Z,0-9,-]+')), callback='parse_item'),)

    def parse_solution(self, response):
        hxs = HtmlXPathSelector(response)
        x = hxs.select("//tr[@class='kol']//td[8]").exctract()
        f = open('test/' + response.url.split('/')[-1] + '.txt', 'wb')
        f.write(x.encode("utf-8"))
        f.close()

    def parse_item(self, response):
        hxs = HtmlXPathSelector(response)
        item = Problem()
        item['title'] = hxs.select("//table[@class='pagetitle-prob']/tr/td/h1/text()").extract()
        item['content'] = hxs.select("//div[@class='node clear-block']//div[@class='content']").extract()
        filename = str(item['title'][0])
        solutions_url = 'http://www.codechef.com/status/' + response.url.split('/')[-1] + '?language=All&status=15&handle=&sort_by=Time&sorting_order=asc'
        Request(solutions_url, callback=self.parse_solution)
        f = open('problems/' + filename + '.html', 'wb')
        f.write("<div style='width:800px;margin:50px'>")
        for i in item['content']:
            f.write(i.encode("utf-8"))
        f.write("</div>")
        f.close()

The parse_solution method is never called, yet the spider runs without any errors.

Answer


You should write yield Request(solutions_url, callback=self.parse_solution) instead of just Request(solutions_url, callback=self.parse_solution). Merely constructing a Request creates the object and then discards it; Scrapy only schedules requests that the callback yields (or returns).
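The mechanics can be shown without Scrapy at all. The `make_request` helper below is a hypothetical stand-in for `scrapy.Request`: a callback that only constructs a request returns None, so the engine has nothing to schedule, whereas a callback containing `yield` becomes a generator the engine can iterate over.

```python
def make_request(url, callback=None):
    # Hypothetical stand-in for scrapy.Request, just for this sketch.
    return {"url": url, "callback": callback}

def parse_item_broken(response):
    # Request object is created, then immediately garbage-collected.
    make_request("http://www.codechef.com/status/FOO")

def parse_item_fixed(response):
    # Yielding hands the request back to the caller for scheduling.
    yield make_request("http://www.codechef.com/status/FOO")

print(parse_item_broken(None))                    # None: nothing to schedule
scheduled = list(parse_item_fixed(None))          # what the engine effectively does
print(scheduled[0]["url"])
```

Because `parse_item_fixed` contains `yield`, calling it does not run its body; it returns a generator, and each request is produced only as Scrapy iterates over it.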