Custom BaseSpider in Scrapy

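I want to provide some common functionality for my spiders in a custom base spider class.

Usually Scrapy spiders inherit from the scrapy.Spider class.

I tried to create a BaseSpider class in Scrapy's spiders folder, but it didn't work: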

import scrapy


class BaseSpider(scrapy.Spider):
    def __init__(self):
        super(scrapy.Spider).__init__()

    def parse(self, response):
        pass

Here is my actual spider:

import scrapy
import BaseSpider


class EbaySpider(BaseSpider):
    name = "ebay"
    allowed_domains = ["ebay.com"]

    def __init__(self):
        self.redis = Redis(host='redis', port=6379)
    # rest of the spider code

This gives the following error:

TypeError: Error when calling the metaclass bases 
    module.__init__() takes at most 2 arguments (3 given)

Then I tried multiple inheritance, making my eBay spider look like this:

class EbaySpider(scrapy.Spider, BaseSpider):

    name = "ebay"
    allowed_domains = ["ebay.com"]

    def __init__(self):
        self.redis = Redis(host='redis', port=6379)
    # rest of the spider code

This gives:

TypeError: Error when calling the metaclass bases 

metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

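I am new to Python as well as Scrapy, and I am trying to code in a PHP style, which does not work here.

I am looking for the proper approach.

Thanks

Update

I changed the __init__ signature to match scrapy.Spider: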

BaseSpider

def __init__(self, *args, **kwargs):
    super(scrapy.Spider, self).__init__(*args, **kwargs)

EbaySpider

class EbaySpider(BaseSpider):
    def __init__(self, *args, **kwargs):
        super(BaseSpider, self).__init__(*args, **kwargs)
        self.redis = Redis(host='redis', port=6379)

I still get:

File "/scrapper/scrapper/spiders/ebay.py", line 11, in <module> 
    class EbaySpider(BaseSpider): 
TypeError: Error when calling the metaclass bases 
    module.__init__() takes at most 2 arguments (3 given)

Source

2017-06-17 Raheel Khan

Do you have an __init__()? – omdv

__init__ in which class? –

Your first error on EbaySpider suggests a problem with __init__. How did you define it? – omdv

Take a look at the scrapy.Spider.__init__ signature:

def __init__(self, name=None, **kwargs): 
    # ...

Subclasses should define an __init__ method with the same signature. If you don't care about name and kwargs, just pass them through to the base class:

import scrapy


class BaseSpider(scrapy.Spider):
    def __init__(self, *args, **kwargs):
        # Forward everything so scrapy.Spider.__init__ still runs.
        super().__init__(*args, **kwargs)

    def parse(self, response):
        pass

EbaySpider doesn't have to inherit from scrapy.Spider if it already inherits from BaseSpider. It should have the same __init__ signature, and it also needs to call super():

from redis import Redis


class EbaySpider(BaseSpider):

    name = "ebay"
    allowed_domains = ["ebay.com"]

    def __init__(self, *args, **kwargs):
        # Run the parent initialisers first, then add spider-specific state.
        super().__init__(*args, **kwargs)
        self.redis = Redis(host='redis', port=6379)

(I'm using the Python 3 syntax for super().)

Edit

There is also another problem: you are importing BaseSpider like this:
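To make the forwarding rule concrete, here is a minimal sketch that runs without Scrapy installed. The `Spider` class below is a simplified stand-in written for this example, not the real scrapy.Spider, and `shared_setup_done`, `connected`, `SkippingSpider` and the `category` argument are hypothetical names. It also shows why the updated code in the question is risky: `super(SomeParent, self)` starts the method lookup after `SomeParent`, so naming the parent instead of the class being defined skips that parent's __init__.

```python
class Spider:
    """Simplified stand-in for scrapy.Spider (NOT the real class)."""

    name = None

    def __init__(self, name=None, **kwargs):
        if name is not None:
            self.name = name
        # Keep extra keyword arguments as attributes on the instance.
        self.__dict__.update(kwargs)
        self.base_initialised = True


class BaseSpider(Spider):
    def __init__(self, *args, **kwargs):
        # Python 2 spelling: super(BaseSpider, self).__init__(*args, **kwargs)
        super().__init__(*args, **kwargs)
        self.shared_setup_done = True  # hypothetical shared state


class EbaySpider(BaseSpider):
    name = "ebay"

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.connected = True  # placeholder for Redis(host='redis', port=6379)


class SkippingSpider(BaseSpider):
    name = "skipping"

    def __init__(self, *args, **kwargs):
        # Naming the parent starts the lookup *after* BaseSpider,
        # so BaseSpider.__init__ never runs.
        super(BaseSpider, self).__init__(*args, **kwargs)


good = EbaySpider(category="phones")
bad = SkippingSpider(category="phones")
```

In this sketch `good` ends up with both the stand-in base state and the shared setup, while `bad` silently misses `shared_setup_done`. In Python 2 the first argument to super() should therefore be the class whose method is being written.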

import BaseSpider

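Most likely you have a module named BaseSpider (a BaseSpider.py file) and a class named BaseSpider. import BaseSpider gives you the module object, not the spider class. Try from BaseSpider import BaseSpider instead, and preferably rename the module to avoid confusion and to follow PEP 8.

Source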

2017-06-17 18:41:57

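I tried implementing your suggestion but still get the same error. Could you please check the updated section? I wonder how Scrapy invokes its spiders, because the documentation doesn't mention anything about custom parent spider classes. –

Aha, you have a different problem. See my edit. –

Even my PyCharm behaves correctly now :D Thanks a lot! Python is so different lol –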
