Accessing the instance of a Scrapy pipeline class

I want to access the variable self.cursor so I can reuse the active PostgreSQL connection, but I can't figure out how to get at the instance of Scrapy's pipeline class.

import os

import psycopg2


class ScrapenewsPipeline(object):

    def open_spider(self, spider):
        # Open the PostgreSQL connection when the spider starts.
        self.connection = psycopg2.connect(
            host=os.environ['HOST_NAME'],
            user=os.environ['USERNAME'],
            database=os.environ['DATABASE_NAME'],
            password=os.environ['PASSWORD'])
        self.cursor = self.connection.cursor()
        self.connection.set_session(autocommit=True)

    def close_spider(self, spider):
        # Release the cursor and the connection when the spider finishes.
        self.cursor.close()
        self.connection.close()

    def process_item(self, item, spider):
        print("Some Magic Happens Here")

    def checkUrlExist(self, item):
        print("I want to call this function from my spider to access the self.cursor variable")

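Note that I know I can reach process_item with yield item, but that function does other things. I want to use the connection through self.cursor in checkUrlExist, and to be able to call it on the class instance from my spider whenever I like. Thanks.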

Source

2017-12-03 atb00ker

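'objectName.cursor'? – RottenCandy

I don't know what objectName is. The pipeline class is instantiated automatically when the spider starts, and I want to hook into that instance of the class! :) – atb00ker

Maybe you should consider 'getattr' https://stackoverflow.com/questions/4075190/what-is-getattr-exactly-and-how-do-i-use-it#4076099 – RottenCandy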

Inside the pipeline you can use spider.variable_name to access any of your spider class's variables.

import scrapy


class MySpider(scrapy.Spider):

    name = "myspider"

    any_variable = "any_value"

And here is your pipeline:

class MyPipeline(object):

    def process_item(self, item, spider):
        # The spider's attributes are reachable through the spider argument.
        spider.any_variable
        return item

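I suggest creating the connection in your spider class, the same way I declared any_variable in my example. It will then be available in your spider as self.any_variable and in your pipeline as spider.any_variable.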
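The spider argument can also be used in the other direction, which is closer to what the question asks for: the pipeline can attach itself to the spider in open_spider, so that spider code can call checkUrlExist on it. Below is a minimal sketch of that idea, not a definitive implementation. Several things in it are assumptions: sqlite3 stands in for psycopg2 so the snippet runs without a database server, the news table and its url column are hypothetical, FakeSpider stands in for a scrapy.Spider subclass, and pipeline is just an attribute name chosen here, not a Scrapy API. It also assumes open_spider has already run when the spider uses the attribute, so it would not be available in the spider's __init__.

```python
import sqlite3


class ScrapenewsPipeline:
    """Pipeline that exposes itself to the spider it is opened for."""

    def open_spider(self, spider):
        # Stand-in for psycopg2.connect(...); sqlite3 keeps the sketch runnable.
        self.connection = sqlite3.connect(":memory:")
        self.cursor = self.connection.cursor()
        # Hypothetical table; the question does not show the schema.
        self.cursor.execute("CREATE TABLE news (url TEXT PRIMARY KEY)")
        # Assumption: "pipeline" is an arbitrary attribute name, not a Scrapy API.
        spider.pipeline = self

    def close_spider(self, spider):
        self.cursor.close()
        self.connection.close()

    def process_item(self, item, spider):
        # Record the URL so later checks can find it.
        self.cursor.execute(
            "INSERT OR IGNORE INTO news (url) VALUES (?)", (item["url"],))
        return item

    def checkUrlExist(self, item):
        # Reuses the connection opened in open_spider.
        self.cursor.execute(
            "SELECT 1 FROM news WHERE url = ?", (item["url"],))
        return self.cursor.fetchone() is not None


class FakeSpider:
    """Stand-in for a scrapy.Spider subclass."""

    name = "myspider"

    def already_scraped(self, item):
        # self.pipeline was set by ScrapenewsPipeline.open_spider.
        return self.pipeline.checkUrlExist(item)
```

With this layout the connection lives on the pipeline instance rather than on each spider class, and the spider only borrows it through self.pipeline.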

Source

2017-12-03 10:53:39 Umair

I have 60 spiders. With this approach each of them would have its own PostgreSQL connection, and I only have limited memory, so this won't work for me. – atb00ker
