Getting the URL when handling urllib2.URLError

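This relates specifically to urllib2, but more generally to custom exception handling. How do I pass additional information to a calling function in another module by raising an exception? I assume I would re-raise using a custom exception class, but I'm not sure of the technical details.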

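Rather than pollute the sample code with what I've tried and failed, I'll simply present it as a mostly blank slate. My end goal is to get the last line in the sample working.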

#mymod.py 
import urllib2 

def openurl(): 
    req = urllib2.Request("http://duznotexist.com/") 
    response = urllib2.urlopen(req) 

#main.py 
import urllib2 
import mymod 

try: 
    mymod.openurl() 
except urllib2.URLError as e: 
    #how do I do this? 
    print "Website (%s) could not be reached due to %s" % (e.url, e.reason)

Source

2011-06-28 mwolfe02

You can add information to the exception object and then re-raise it.

#mymod.py
import urllib2

def openurl():
    url = "http://duznotexist.com/"
    req = urllib2.Request(url)
    try:
        response = urllib2.urlopen(req)
    except urllib2.URLError as e:
        # add the requested URL to the exception object
        e.url = url
        # do not assign e.reason: urllib2 already sets it, and on HTTPError
        # it can be a read-only property (assigning raises AttributeError)
        raise  # re-raise the exception, so the calling function can catch it

#main.py 
import urllib2 
import mymod 

try: 
    mymod.openurl() 
except urllib2.URLError as e: 
    print "Website (%s) could not be reached due to %s" % (e.url, e.reason)
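
The question also mentions a custom exception class. The following is only a sketch of that alternative, not part of the original answer: `FetchError`, the `opener` parameter, and `failing_opener` are hypothetical names introduced here for illustration. The stand-in opener replaces `urllib2.urlopen` so the example runs without a network, and the guarded import is an assumption that lets the same code run on Python 2 and Python 3.

```python
try:
    from urllib2 import URLError          # Python 2
except ImportError:
    from urllib.error import URLError     # Python 3


class FetchError(URLError):
    """Hypothetical URLError subclass that remembers the requested URL."""

    def __init__(self, url, reason):
        URLError.__init__(self, reason)
        self.url = url


def openurl(url, opener):
    """Call opener(url); on URLError, re-raise as FetchError carrying the URL."""
    try:
        return opener(url)
    except URLError as e:
        raise FetchError(url, e.reason)


def failing_opener(url):
    # stand-in for urllib2.urlopen so the sketch runs without a network
    raise URLError("name resolution failed")


try:
    openurl("http://duznotexist.com/", failing_opener)
except URLError as e:
    # FetchError is still a URLError, so existing handlers keep working
    message = "Website (%s) could not be reached due to %s" % (e.url, e.reason)
    print(message)
```

Because the subclass sets `url` in its constructor, the caller does not depend on attributes being patched onto an exception after the fact. In real use, `urllib2.urlopen` would be passed as `opener`.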

Source

2011-06-28 15:58:38

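+1 Yes, this is exactly what I was looking for. I figured it was simple, but I just wasn't getting there through Google or trial and error. – mwolfe02

urllib2.urlopen() will follow redirects, so 'e.url_original' would be more appropriate. I haven't figured out how to get the 'url_actual' that triggered the URLError. I'm not trying to nitpick here. If you're opening a.com and it 301-redirects to b.com, urlopen will automatically follow that because an HTTPError with a redirect was raised. If b.com causes the URLError, the code above marks a.com as not existing, when it is up and working perfectly and merely points to a bad URL at b.com. –

'e.reason = "URL does not exist"' will give 'AttributeError: can't set attribute' – histrio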

I don't think re-raising the exception is an appropriate way to solve this problem.

As @Jonathan Vanasco said,

if you're opening a.com, and it 301 redirects to b.com, urlopen will automatically follow that because an HTTPError with a redirect was raised. if b.com causes the URLError, the code above marks a.com as not existing

My solution is to override the redirect_request method of urllib2.HTTPRedirectHandler:

import urllib2

class NewHTTPRedirectHandler(urllib2.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        m = req.get_method()
        if (code in (301, 302, 303, 307) and m in ("GET", "HEAD")
                or code in (301, 302, 303) and m == "POST"):
            newurl = newurl.replace(' ', '%20')
            # drop headers that describe the body of the original request
            newheaders = dict((k, v) for k, v in req.headers.items()
                              if k.lower() not in ("content-length", "content-type"))
            # reuse the req object
            # mind that req will be changed if a redirection happens
            req.__init__(newurl,
                         headers=newheaders,
                         origin_req_host=req.get_origin_req_host(),
                         unverifiable=True)
            return req
        else:
            raise urllib2.HTTPError(req.get_full_url(), code, msg, headers, fp)

opener = urllib2.build_opener(NewHTTPRedirectHandler)
urllib2.install_opener(opener)
# mind that req will be changed if a redirection happens
#req = urllib2.Request('http://127.0.0.1:5000')
req = urllib2.Request('http://www.google.com/')

try:
    response = urllib2.urlopen(req)
except urllib2.URLError as e:
    print 'error'
    print req.get_full_url()
else:
    print 'normal'
    print response.geturl()

Let's try a URL that redirects to an unknown URL:

import os
from flask import Flask, redirect

app = Flask(__name__)

@app.route('/')
def hello():
    # return 'hello world'
    return redirect("http://a.com", code=302)

if __name__ == '__main__':
    port = int(os.environ.get('PORT', 5000))
    app.run(host='0.0.0.0', port=port)

The results are:

error 
http://a.com/ 

normal 
http://www.google.com/
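
A possible variation, sketched here under assumptions rather than taken from the answer: instead of re-initialising the request in place, the handler can defer to the stock `redirect_request` and simply record each redirect target, so both the originally requested URL and the last URL actually tried remain available. `RecordingRedirectHandler` and its `visited` list are hypothetical names; the guarded import is an assumption that lets the sketch run on Python 2 and Python 3, and the handler is called directly so no network is needed.

```python
try:
    import urllib2 as urlreq               # Python 2
except ImportError:
    import urllib.request as urlreq        # Python 3


class RecordingRedirectHandler(urlreq.HTTPRedirectHandler):
    """Hypothetical handler that remembers every URL it was redirected to."""

    def __init__(self):
        self.visited = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # let the stock implementation decide whether to follow the redirect
        new_req = urlreq.HTTPRedirectHandler.redirect_request(
            self, req, fp, code, msg, headers, newurl)
        if new_req is not None:
            self.visited.append(new_req.get_full_url())
        return new_req


handler = RecordingRedirectHandler()
original = urlreq.Request("http://a.com/")

# simulate the redirect step that the opener would perform on a 301 response
handler.redirect_request(original, None, 301, "Moved Permanently", {},
                         "http://b.com/")

# the last URL tried is the latest redirect target, if there was one
last_url = handler.visited[-1] if handler.visited else original.get_full_url()
print(last_url)
```

With a network available, the handler instance would be passed to `build_opener(handler)` and the request opened through that opener; in an `except URLError` block, `original.get_full_url()` would then still give the URL first requested, while `handler.visited` would hold the redirect chain. The list would need clearing between requests if the opener is reused.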

Source

2017-02-23 16:50:36 superhan
