2016-12-29

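Using the output of one function as the input of another

I want to monitor a website and be notified whenever a new item is posted. The site is not updated often, so I am fairly sure that any update will be the item I care about. My plan is to pick a "starting number" by counting the links on the page, then compare that number with the current link count every 10 minutes until the link count is greater than the starting number.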

First, I run this to get the "starting number" of links:

links=[] 
for link in soup.findAll('a'): 
    links.append(link.get('href')) 
start_num = len(links) 
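
The counting step can be tried offline before pointing it at a real page. The following is a minimal sketch that swaps in the standard library's `html.parser` for BeautifulSoup so that it runs without extra packages; `LinkCounter`, `count_links` and the sample HTML are hypothetical names and data, not part of the original post.

```python
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Counts <a> start tags, roughly what len(soup.findAll('a')) reports."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.count += 1


def count_links(html):
    # html is the page source as a str (hypothetical helper)
    parser = LinkCounter()
    parser.feed(html)
    parser.close()
    return parser.count


sample = '<p><a href="/one">one</a> and <a href="/two">two</a></p>'
start_num = count_links(sample)
print(start_num)
```

With a real page, the string passed to `count_links` would be the decoded response body rather than the inline sample.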

Then I compare that number with the current number of links every 5 minutes:

notify=True
while notify:
    try:
        page = urllib.request.urlopen('web/site/url')
        soup = bs(page, "lxml")

        links=[]
        for link in soup.findAll('a'):
            links.append(link.get('href'))

        if len(links) > start_num:
            message = client.messages.create(to="", from_="",body="")
            print('notified')
            notify=False
        else:
            print('keep going')
            time.sleep(60*5)

    except:
        print("Going to sleep")
        time.sleep(60*10)

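How can I combine all of this into a single function that stores the starting number of links when I run it, without overwriting that number every time I check the link count?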


If you want to keep state across calls to a function, you should consider using a class.

Answers


You can do this in at least two ways: with a decorator or with a generator.

Decorator:

import functools
import time

def hang_on(func):

    # soup should be in a visible scope
    def count_links():
        # refresh page?
        return len(soup.findAll('a'))

    # computed once, when the decorator is applied
    start_num = count_links()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # without nonlocal, the assignment below would make start_num a local
        # variable and the comparison would raise UnboundLocalError
        nonlocal start_num
        while True:
            try:
                new_links = count_links()
                if new_links > start_num:
                    start_num = new_links
                    return func(*args, **kwargs)
                print('keep going')
                time.sleep(60*5)
            except Exception:
                print("Going to sleep")
                time.sleep(60*10)

    return wrapper

@hang_on
def notify():
    message = client.messages.create(to="", from_="",body="")
    print('notified')

# somewhere in your code, simply:
notify()
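
The part of the decorator that actually answers the question is the closure: an inner function that remembers a variable from the enclosing call. Here is a minimal sketch of just that idea, with the page fetch replaced by an injected callable so it can run without a network; `make_watcher`, `check` and `fake_counts` are hypothetical names used for illustration only.

```python
def make_watcher(count_links):
    """Return a check() function that remembers the link count it last saw.

    count_links is any zero-argument callable returning the current count
    (a hypothetical stand-in for fetching and parsing the page).
    """
    start_num = count_links()  # captured once, when the watcher is created

    def check():
        nonlocal start_num  # rebind the enclosing variable, not a new local
        current = count_links()
        if current > start_num:
            start_num = current  # move the baseline so each change reports once
            return True
        return False

    return check


# Simulated counts: baseline 10, two unchanged polls, then a new link appears.
fake_counts = iter([10, 10, 10, 11, 11])
check = make_watcher(lambda: next(fake_counts))
results = [check() for _ in range(4)]
print(results)
```

Keeping the sleeping and the notification outside `check()` is a design choice for this sketch: the stateful comparison can then be tested on its own, and the polling loop stays a few lines around it.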

Generator:

def gen_example(soup):

    # initialize soup (perhaps from url)

    # soup should be in a visible scope
    def count_links():
        # refresh page?
        return len(soup.findAll('a'))

    start_num = count_links()

    while True:
        try:
            new_links = count_links()
            if new_links > start_num:
                start_num = new_links
                message = client.messages.create(to="", from_="",body="")
                print('notified')
                yield True # this is what makes this func a generator

            print('keep going')
            time.sleep(60*5)
        except Exception:  # a bare except would also swallow GeneratorExit
            print("Going to sleep")
            time.sleep(60*10)

# somewhere in your code:
gen = gen_example(soup) # initialize

next(gen) # will wait and notify (Python 3; Python 2 used gen.next())

# coming soon 
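
A generator keeps its local variables alive between `yield`s, which is why the starting number is not overwritten on each check. The sketch below isolates that behaviour, assuming the link counts arrive as a plain iterable instead of from live page fetches; `watch` and the sample counts are hypothetical.

```python
def watch(counts):
    """Yield the new count each time it rises above the remembered baseline.

    counts is any iterable of link counts (a hypothetical stand-in for
    polling the page); the first value becomes the starting number.
    """
    counts = iter(counts)
    start_num = next(counts)  # kept alive between yields by the generator
    for current in counts:
        if current > start_num:
            start_num = current
            yield current


gen = watch([10, 10, 12, 12, 15])
first = next(gen)   # Python 3 spelling; gen.next() is Python 2 only
second = next(gen)
print(first, second)
```

Each `next(gen)` resumes right after the previous `yield`, with `start_num` still holding the last count that triggered a notification.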

I would implement it as a class, because the resulting code is readable and easy to maintain. Enjoy:

import urllib.request
from time import sleep

from bs4 import BeautifulSoup as bs

# client is the messaging client from the question; it must already exist

class Notifier:
    url = 'web/site/url'
    timeout = 60 * 10

    def __links_count(self):
        # fetch the page and count its <a> tags
        page = urllib.request.urlopen(self.url)
        soup = bs(page, "lxml")

        links=[]
        for link in soup.findAll('a'):
            links.append(link.get('href'))

        return len(links)

    def __notify(self):
        client.messages.create(to="", from_="", body="")
        print('notified')

    def run(self):
        # the starting count is taken once and kept for the whole loop
        current_count = self.__links_count()

        while True:
            try:
                new_count = self.__links_count()

                if new_count > current_count:
                    self.__notify()
                    break

                sleep(self.timeout)

            except Exception:
                print('Keep going')
                sleep(self.timeout)

notifier = Notifier()
notifier.run()

So the self references used in the methods (self.url, self.timeout) always refer to the variables that are defined in the class but outside of any method? – e1v1s
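
As far as Python's attribute lookup goes, yes: `self.url` is looked up on the instance first and falls back to the class when the instance has no attribute of that name. A small sketch, using a stripped-down class with a hypothetical `describe` method in place of the real methods, illustrates this:

```python
class Notifier:
    # Class attributes: defined in the class body, outside any method
    url = 'web/site/url'
    timeout = 60 * 10

    def describe(self):
        # self.url looks on the instance first, then falls back to the class
        return self.url, self.timeout


a = Notifier()
b = Notifier()
b.timeout = 30  # creates an instance attribute that shadows the class one

print(a.describe())
print(b.describe())
print(Notifier.timeout)
```

Assigning through an instance, as with `b.timeout`, only affects that instance; the class attribute and every other instance keep the original value.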