如何在Python中停止一個調度器迭代

我是Python新手（通常編程）。我有一個pcode，通過PyCURL定期下載網站內容並做一些搜索。我在While-True內部使用了scheduler來建立一個無限循環，我在其中創建一個對象並調用其方法start（）來獲取網站並執行一些搜索.. getbody（）方法無法取得網站時發生問題，由於連接問題（或其他原因）。 BeautifulSoup期望字符串，否則會引發錯誤。如何在Python中停止一個調度器迭代

如何在getbody（）方法中發生錯誤/異常並等待另一個線程時停止調度程序的線程？由於getbody（）方法返回空字符串會浪費CPU時間。

#Parser_module 
class Parser(object): 
    def __init__(self): 
     self.body = BeautifulSoup(self.getbody(), "lxml") 
     self.buffer = BytesIO() 

    def getbody(self): 
     # some code to set pycurl up 
     try: 
      c.perform() 
     except pycurl.error: 
      print("connection error") 
      # returns an emptry string to feed the BeautifulSoup with 
      return "" 
     body = self.buffer.getvalue().decode("utf-8") 
     return body 

    def start(self): 
     #calls other functions to perform some searching 
     self.otherfunction() 

    def otherfunction(self): 
     . 
     . 
     . 

#Scheduler module 
import Parser_module 
from threading import Timer 

def start_search(): 
    parser = Parser() 
    parser.start() 
    t = Timer(20.0, start_search) 
    t.start()

來源

2015-08-25 Martin Pelikan

我不能跟隨你的while-True發生在哪裏，你可以提供更多的細節？如果在循環中調用引發錯誤的東西，您可以嘗試，除了特定的錯誤並在except子句中調用'''continue'''，它將跳過一個interation並忽略接下來的事情。 –

@Francisco Vargas - 我在While循環中調用了start_search（）函數，但是正如我想的那樣，它沒有用處，因爲timer本身提供了無限的代碼處理。但不知道「繼續」。很有用。謝謝！ –

而是在Parser.__init__抓取網址時，你可能只是這樣做，在Parser.start，如果發生錯誤返回。

class Parser(object): 
    def __init__(self): 
     self.body = None 
     self.buffer = BytesIO() 

    def start(): 
     data = self.getbody() 
     if not data: 
      return 
     self.body = BeautifulSoup(data, "lxml") 
     self.otherfunction() 

    def getbody(self): 
     ... 

    def otherfunction(self): 
     ...

在一個側面說明，我建議你使用的不是pycurl的好得多requests庫，如果你能。還可以查看Python風格指南PEP8，例如有關如何命名事物的一些建議。

來源

2015-08-25 10:21:22

我喜歡在'start（）'函數中獲取URL的想法，它使得很多事情更容易。並且一定會查看請求庫。謝謝。 –

很高興能幫到你！隨時接受和/或upvote我的答案;） –

如何在Python中停止一個調度器迭代

回答

相關問題