Scrapy: skip an item and continue with execution

I'm writing an RSS spider. If the current node has no match for the item, I want the spider to ignore that node and carry on. So far I have this:

    if info.startswith('Foo'):
        item['foo'] = info.split(':')[1]
    else:
        return None

(`info` is a string that was sanitized from an XPath result earlier...)

But I get this exception:

    exceptions.TypeError: You cannot return an "NoneType" object from a spider

So, how can I ignore this node and continue execution?

Source

2011-02-18 anders

    def parse(self, response):
        # make some manipulations
        if info.startswith('Foo'):
            item['foo'] = info.split(':')[1]
            return [item]
        else:
            # an empty list means "no items for this response"
            return []

But it is better not to use return at all: use yield, or simply do nothing.

    def parse(self, response):
        # make some manipulations
        if info.startswith('Foo'):
            item['foo'] = info.split(':')[1]
            yield item
        else:
            # a bare return ends the generator without yielding anything
            return
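
The yield-based version extends naturally to a feed with many nodes: skip the ones that do not match with `continue` and keep iterating. Below is a minimal sketch in plain Python rather than a real Scrapy callback, so it runs on its own. The name `parse_nodes` and the sample strings are made up for illustration, and a plain dict stands in for the item.

```python
def parse_nodes(infos):
    """Yield one item per matching node and silently skip the rest."""
    for info in infos:
        if not info.startswith('Foo'):
            continue  # ignore this node and move on to the next one
        # Split on the first colon only, so the value may itself contain colons.
        yield {'foo': info.split(':', 1)[1]}


# Hypothetical input: two matching nodes and one that should be skipped.
items = list(parse_nodes(['Foo:1', 'Bar:2', 'Foo:3']))
print(items)
```

Because nothing is ever returned for a skipped node, the caller never sees `None`; it just receives fewer items.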

Source

2011-02-18 13:32:37 seriyPS

return [] seems to work fine, thanks! – anders 2011-02-18 16:53:20

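There is also an undocumented approach I figured out for when I need to skip an item during parsing, but from outside the callback function.

Simply raise StopIteration anywhere during parsing.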

    from scrapy import Spider

    # Item, get_some_value and something_wrong are placeholders.
    class MySpider(Spider):
        def parse(self, response):
            value1 = self.parse_something1()
            value2 = self.parse_something2()
            yield Item(value1, value2)

        def parse_something1(self):
            try:
                return get_some_value()
            except Exception:
                self.skip_item()

        def parse_something2(self):
            if something_wrong:
                self.skip_item()

        def skip_item(self):
            # propagates up through parse() and ends the generator
            raise StopIteration
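
One caveat worth knowing: since Python 3.7 (PEP 479), a StopIteration that escapes into a generator body is converted to a RuntimeError, so the trick above depends on older generator semantics. One possible alternative, sketched here in plain Python without Scrapy, is a dedicated exception that the callback catches itself. `SkipItem`, `extract_value`, `parse_entries` and `leaky` are hypothetical names for illustration and are not part of Scrapy.

```python
class SkipItem(Exception):
    """Hypothetical signal: abandon the current item, keep parsing."""


def extract_value(raw):
    # Helper that runs outside the callback; raising here skips the item.
    if not raw.startswith('Foo'):
        raise SkipItem(raw)
    return raw.split(':', 1)[1]


def parse_entries(raw_entries):
    for raw in raw_entries:
        try:
            value = extract_value(raw)
        except SkipItem:
            continue  # skip this entry only; later entries are still parsed
        yield {'foo': value}


def leaky():
    # Raising StopIteration inside a generator body triggers PEP 479.
    raise StopIteration
    yield  # unreachable, but makes this function a generator


print(list(parse_entries(['Foo:a', 'junk', 'Foo:b'])))
try:
    list(leaky())
except RuntimeError as exc:
    print('RuntimeError:', exc)
```

Catching the exception inside the loop also means one bad entry does not end the whole callback, which raising StopIteration would do.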

Source

2017-06-11 22:20:01
