How to catch 404 errors with urllib.urlretrieve

Background: I am using urllib.urlretrieve, rather than any other function in the urllib* modules, because it supports a hook function (see reporthook below), which I use to display a textual progress bar. This is Python >= 2.6.

>>> urllib.urlretrieve(url[, filename[, reporthook[, data]]])
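For context, a minimal sketch of the kind of hook the question has in mind. urlretrieve calls reporthook with the number of blocks transferred so far, the block size in bytes, and the total size (-1 if unknown). The helper name format_progress and the bar layout are assumptions made for illustration; only the three-argument callback signature comes from urllib.

```python
import sys

def format_progress(block_count, block_size, total_size, width=30):
    """Return a one-line text progress bar for the given transfer state."""
    if total_size <= 0:
        # Size unknown: urlretrieve reports -1 when there is no Content-Length.
        return "%d bytes" % (block_count * block_size)
    # The last block is usually short, so clamp to the real total.
    done = min(block_count * block_size, total_size)
    filled = width * done // total_size
    percent = 100 * done // total_size
    return "[%s%s] %3d%%" % ("#" * filled, "-" * (width - filled), percent)

def my_report_hook(block_count, block_size, total_size):
    """reporthook callback: redraw the bar in place on stderr."""
    sys.stderr.write("\r" + format_progress(block_count, block_size, total_size))
    sys.stderr.flush()
```

It would be passed as urllib.urlretrieve(url, filename, my_report_hook).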

However, urlretrieve is so dumb that it leaves no way to detect the status of the HTTP request (e.g., was it 404 or 200?).

>>> fn, h = urllib.urlretrieve('http://google.com/foo/bar') 
>>> h.items() 
[('date', 'Thu, 20 Aug 2009 20:07:40 GMT'), 
('expires', '-1'), 
('content-type', 'text/html; charset=ISO-8859-1'), 
('server', 'gws'), 
('cache-control', 'private, max-age=0')] 
>>> h.status 
'' 
>>>

What is the best known way to download a remote HTTP file with hook-like support (to show a progress bar) and decent HTTP error handling?

2009-08-20 Sridhar Ratnakumar

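That the HTTP status is not made available for your request should probably be considered a bug in the stdlib (but do check out the better library, requests, below) – 2016-03-17 20:37:48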

Check out the complete code of urllib.urlretrieve:

def urlretrieve(url, filename=None, reporthook=None, data=None):
    global _urlopener
    if not _urlopener:
        _urlopener = FancyURLopener()
    return _urlopener.retrieve(url, filename, reporthook, data)

In other words, you can use urllib.FancyURLopener (it is part of the public urllib API). You can override http_error_default to detect 404s:

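import urllib

class MyURLopener(urllib.FancyURLopener):
    def http_error_default(self, url, fp, errcode, errmsg, headers):
        # handle errors the way you'd like to; raising is one option,
        # so that a 404 no longer passes silently
        raise IOError('http error', errcode, errmsg, headers)

fn, h = MyURLopener().retrieve(url, reporthook=my_report_hook)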

2009-08-20 21:11:37 orip

I don't want to specify handlers; does it throw exceptions like urllib2.urlopen does? – 2009-08-20 21:14:40

It's easy to make it throw. FancyURLopener subclasses URLopener, which does throw, so you can try calling the base class's implementation: def http_error_default(...): URLopener.http_error_default(...) – orip 2009-08-20 21:35:26

This is a very nice solution; I am now using it myself. – 2010-01-02 22:34:29

The URL opener object's "retrieve" method supports the reporthook and throws an exception on 404.

http://docs.python.org/library/urllib.html#url-opener-objects

2009-08-20 21:13:46 Mark

Yes, but it doesn't support redirects, etc. – 2009-08-20 21:15:47

To get exceptions, you should use:

import urllib2 

try: 
    resp = urllib2.urlopen("http://www.google.com/this-gives-a-404/") 
except urllib2.URLError, e: 
    if not hasattr(e, "code"): 
        raise
    resp = e 

print "Gave", resp.code, resp.msg 
print "=" * 80 
print resp.read(80)

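Edit: The rationale here is that unless you expect the exceptional state, it is an exception when it happens, and you probably didn't even think about it. So instead of letting your code continue to run when the request was unsuccessful, the default behavior, quite sensibly, is to stop its execution.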

2010-02-04 20:17:57 lericson

Hook support? – 2010-02-05 16:02:52

Sridhar, see http://stackoverflow.com/a/9740603/819417 – 2012-03-16 16:07:36
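On the hook question: one possible way to get both the exception behaviour of urlopen and a reporthook is to read the response in blocks yourself. This is only a sketch and is not taken from any answer in the thread. The function name download and the block_size default are assumptions; the hook is called with the same three arguments urlretrieve uses. The import falls back to urllib.request and urllib.error, where these names live on Python 3.

```python
try:
    from urllib2 import urlopen, HTTPError   # Python 2, as used in this thread
except ImportError:
    from urllib.request import urlopen       # Python 3 location of the same call
    from urllib.error import HTTPError

def download(url, filename, reporthook=None, block_size=8192):
    """Save url to filename, calling reporthook after every block.

    urlopen raises HTTPError for a 404 (or any other HTTP error status),
    so a missing page is never written to disk as if it were the file.
    """
    response = urlopen(url)
    try:
        headers = response.info()
        length = headers.get("Content-Length")
        # Mirror urlretrieve: report -1 when the size is not known.
        total_size = int(length) if length is not None else -1
        block_count = 0
        with open(filename, "wb") as out:
            if reporthook:
                reporthook(block_count, block_size, total_size)
            while True:
                block = response.read(block_size)
                if not block:
                    break
                out.write(block)
                block_count += 1
                if reporthook:
                    reporthook(block_count, block_size, total_size)
    finally:
        response.close()
    return filename, headers
```

A call would look like download(url, "page.html", reporthook=my_report_hook), wrapped in try/except HTTPError to inspect e.code. Since urlopen follows redirects by default, this should also cover the redirect concern raised above.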
