2011-06-28 25 views
7

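Getting the URL when handling urllib2.URLError

This is specifically about urllib2, but more generally about custom exception handling. How do I pass additional information to a calling function in another module via a raised exception? I assume I would re-raise using a custom exception class, but I'm not sure of the technical details.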

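Rather than pollute the sample code with the approaches I tried that failed, I'm presenting it as a mostly blank slate. My end goal is to get the last line of the sample working.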

#mymod.py 
import urllib2 

def openurl(): 
    req = urllib2.Request("http://duznotexist.com/") 
    response = urllib2.urlopen(req) 

#main.py 
import urllib2 
import mymod 

try: 
    mymod.openurl() 
except urllib2.URLError as e: 
    #how do I do this? 
    print "Website (%s) could not be reached due to %s" % (e.url, e.reason) 

Answers

8

You can add information to the exception and then re-raise it.

#mymod.py 
import urllib2 

def openurl(): 
    req = urllib2.Request("http://duznotexist.com/") 
    try:
        response = urllib2.urlopen(req)
    except urllib2.URLError as e:
        # attach the requested URL to the exception object;
        # e.reason is already set by urllib2 (and is a read-only
        # property on HTTPError), so it is not assigned here
        e.url = req.get_full_url()
        raise  # re-raise with the original traceback, so the calling function can catch it

#main.py 
import urllib2 
import mymod 

try: 
    mymod.openurl() 
except urllib2.URLError as e: 
    print "Website (%s) could not be reached due to %s" % (e.url, e.reason) 
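
For the custom exception class the question mentions, here is a minimal Python 3 sketch. It assumes `urllib.request` and `urllib.error`, the Python 3 successors of `urllib2`. The names `UrlOpenError`, `open_url` and the `opener` parameter are hypothetical and exist only for illustration; they are not part of the original answer.

```python
import urllib.error
import urllib.request


class UrlOpenError(urllib.error.URLError):
    """URLError that also remembers which URL was requested (hypothetical name)."""

    def __init__(self, url, reason):
        super().__init__(reason)
        self.url = url


def open_url(url, opener=urllib.request.urlopen):
    """Open url; on failure raise UrlOpenError carrying the requested URL.

    The opener parameter is an assumption added so the function can be
    exercised without network access.
    """
    try:
        return opener(url)
    except urllib.error.URLError as e:
        # keep the original reason and chain the original exception
        raise UrlOpenError(url, e.reason) from e
```

Because `UrlOpenError` subclasses `URLError`, a caller that already catches `URLError` keeps working and can additionally read `e.url`. Note that `url` here is the URL that was requested, not necessarily the one that failed after a redirect.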
+0

+1 Yes, that's what I was looking for. I figured it would be something simple, but I just wasn't getting there via Google or trial and error. – mwolfe02

+1

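urllib2.urlopen() will follow redirects, so 'e.url_original' would be more appropriate. I haven't figured out how to get the 'url_actual' that triggered the URLError. I'm not trying to nitpick here. If you're opening a.com and it 301 redirects to b.com, urlopen will automatically follow that because an HTTPError with a redirect was raised. If b.com causes the URLError, the code above marks a.com as not existing, when in fact it exists and works perfectly; it just points to a bad URL at b.com. –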

+0

'e.reason = "URL does not exist"' raises 'AttributeError: can't set attribute' – histrio
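
The `AttributeError` mentioned in the comment above can be reproduced without any network access. This is a small Python 3 sketch, assuming `urllib.error` as the successor of the `urllib2` exception classes; the URL and messages are placeholders. On `HTTPError`, `reason` is a read-only property, while on a plain `URLError` it is an ordinary attribute.

```python
import urllib.error

# Plain URLError: reason is a normal attribute, so assignment works,
# but it throws away the real cause of the failure.
plain = urllib.error.URLError("original reason")
plain.reason = "URL does not exist"

# HTTPError: reason is a read-only property backed by the HTTP status message.
http = urllib.error.HTTPError("http://a.example/", 404, "Not Found", {}, None)
try:
    http.reason = "URL does not exist"
    assignable = True
except AttributeError:
    assignable = False
```

Since `HTTPError` is a subclass of `URLError`, an `except URLError` block that assigns to `e.reason` can fail this way whenever the server answers with an HTTP error status.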

0

I don't think re-raising the exception is the appropriate way to solve this problem.

As @Jonathan Vanasco said,

if you're opening a.com , and it 301 redirects to b.com , urlopen will automatically follow that because an HTTPError with a redirect was raised. if b.com causes the URLError , the code above marks a.com as not existing

My solution is to override redirect_request in urllib2.HTTPRedirectHandler:

import urllib2 

class NewHTTPRedirectHandler(urllib2.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        m = req.get_method()
        if (code in (301, 302, 303, 307) and m in ("GET", "HEAD")
                or code in (301, 302, 303) and m == "POST"):
            newurl = newurl.replace(' ', '%20')
            # drop body-related headers; the redirected request carries no body
            newheaders = dict((k, v) for k, v in req.headers.items()
                              if k.lower() not in ("content-length", "content-type"))
            # reuse the req object instead of creating a new Request, so that
            # after a redirect req.get_full_url() returns the redirected URL
            req.__init__(newurl,
                         headers=newheaders,
                         origin_req_host=req.get_origin_req_host(),
                         unverifiable=True)
            return req
        else:
            # HTTPError must be qualified, since only urllib2 itself is imported
            raise urllib2.HTTPError(req.get_full_url(), code, msg, headers, fp)

opener = urllib2.build_opener(NewHTTPRedirectHandler)
urllib2.install_opener(opener)
# mind that req will be changed if a redirection happens
#req = urllib2.Request('http://127.0.0.1:5000') 
req = urllib2.Request('http://www.google.com/') 

try: 
    response = urllib2.urlopen(req) 
except urllib2.URLError as e: 
    print 'error' 
    print req.get_full_url() 
else: 
    print 'normal' 
    print response.geturl() 

Let's try a URL that redirects to an unknown URL:

import os 
from flask import Flask,redirect 

app = Flask(__name__) 

@app.route('/') 
def hello(): 
    # return 'hello world' 
    return redirect("http://a.com", code=302) 

if __name__ == '__main__':
    port = int(os.environ.get('PORT', 5000))
    app.run(host='0.0.0.0', port=port)

The result is:

error 
http://a.com/ 

normal 
http://www.google.com/
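
The redirect-tracking idea above can also be sketched for Python 3 without mutating the original request. This is only an illustration, assuming `urllib.request` and `urllib.error` as the successors of `urllib2`. The names `RecordingRedirectHandler`, `last_url`, `fetch` and the `failed_url` attribute are hypothetical.

```python
import urllib.error
import urllib.request


class RecordingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Redirect handler that remembers the most recent redirect target."""

    def __init__(self):
        self.last_url = None  # stays None if no redirect was followed

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # let the standard handler decide whether the redirect is allowed
        new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
        if new_req is not None:
            self.last_url = new_req.full_url
        return new_req


def fetch(url):
    """Open url; on failure attach the URL that was actually being fetched."""
    tracker = RecordingRedirectHandler()
    opener = urllib.request.build_opener(tracker)
    try:
        return opener.open(url)
    except urllib.error.URLError as e:
        # failed_url is a made-up attribute name, chosen to avoid clashing
        # with attributes that urllib itself sets on its exceptions
        e.failed_url = tracker.last_url or url
        raise
```

A caller can then catch `urllib.error.URLError` and report `e.failed_url`, which is the last redirect target if any redirect was followed and the originally requested URL otherwise.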