Python HTTP HEAD - dealing with redirects properly?

I can use urllib2 to make HEAD requests like so:

import urllib2 
request = urllib2.Request('http://example.com') 
request.get_method = lambda: 'HEAD' 
urllib2.urlopen(request)

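The problem is that when this follows a redirect, it appears to use GET instead of HEAD.

The purpose of this HEAD request is to check the size and content type of the URL I'm about to download, so that I can make sure I don't download some huge document. (URLs are supplied by random internet users over IRC.)

How can I make it use HEAD requests when following redirects?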

Source

2012-04-01 Krenair

[Requests](http://docs.python-requests.org/en/latest/index.html) at least claims to do this the right way (at least, it documents redirect behavior for idempotent methods, and specifically calls out HEAD in the docs). – 2012-04-01 19:41:25

A similar solution: http://stackoverflow.com/questions/9890815/python-get-headers-only-using-urllib2/9892207#9892207 – newtover 2012-04-01 21:00:21

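Good question! If you're set on using urllib2, you'll want to look at this answer about building your own redirect handler.

In short (read: blatantly stolen from the previous answer):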

import urllib2

# redirect_handler = urllib2.HTTPRedirectHandler()

class MyHTTPRedirectHandler(urllib2.HTTPRedirectHandler):
    def http_error_302(self, req, fp, code, msg, headers):
        # Hook point: inspect or modify the redirect here before following it.
        print "Cookie Manip Right Here"
        return urllib2.HTTPRedirectHandler.http_error_302(self, req, fp, code, msg, headers)

    # Route the other redirect status codes through the same handler.
    http_error_301 = http_error_303 = http_error_307 = http_error_302

cookieprocessor = urllib2.HTTPCookieProcessor()

# Install an opener that uses the custom redirect handler and a cookie jar.
opener = urllib2.build_opener(MyHTTPRedirectHandler, cookieprocessor)
urllib2.install_opener(opener)

response = urllib2.urlopen("WHEREEVER")  # placeholder URL
print response.read()

print cookieprocessor.cookiejar
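
The handler above shows where to hook into redirects, but it does not by itself keep the method as HEAD. Below is a minimal sketch of one way that could be done, by overriding redirect_request and re-applying the same get_method trick from the question to the follow-up request. The names HeadPreservingRedirectHandler and head are hypothetical, and the code assumes Python 3 (urllib.request) with a fallback import for Python 2's urllib2; it is an illustration, not the handler the asker ended up using.

```python
try:
    import urllib2 as urlreq          # Python 2
except ImportError:
    import urllib.request as urlreq   # Python 3


class HeadPreservingRedirectHandler(urlreq.HTTPRedirectHandler):
    """Hypothetical handler: keep HEAD as HEAD when following a redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Let the stock handler validate the redirect and build the new request.
        new_req = urlreq.HTTPRedirectHandler.redirect_request(
            self, req, fp, code, msg, headers, newurl)
        # The stock handler builds a plain request, which defaults to GET.
        # If the original request was HEAD, force the follow-up to HEAD too.
        if new_req is not None and req.get_method() == "HEAD":
            new_req.get_method = lambda: "HEAD"
        return new_req


def head(url, timeout=10):
    """Send a HEAD request that stays HEAD across redirects (hypothetical helper)."""
    request = urlreq.Request(url)
    request.get_method = lambda: "HEAD"
    # Passing the subclass replaces the default HTTPRedirectHandler.
    opener = urlreq.build_opener(HeadPreservingRedirectHandler)
    return opener.open(request, timeout=timeout)
```

With this in place, something like head('http://example.com').info() should give the headers of the final URL without a body being fetched at any hop.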

Additionally, as mentioned in the errata, you can use Python Requests.

Source

2012-04-01 19:43:07 MrGomez

I ended up using this redirect handler, based on what you found: http://pastebin.com/m7aN21A7 Thanks! – Krenair 2012-04-01 20:59:27

@Krenair Glad to help! – MrGomez 2012-04-01 21:02:57

You can do this with the requests library:

>>> import requests 
>>> r = requests.head('http://github.com', allow_redirects=True) 
>>> r 
<Response [200]> 
>>> r.history 
[<Response [301]>] 
>>> r.url 
u'https://github.com/'
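
To tie this back to the question's goal of checking size and content type before downloading, here is a rough sketch of how the headers from such a HEAD response might be checked. The function name looks_safe_to_download, the 5 MiB limit, and the allowed types are all assumptions made for illustration; it takes any mapping of header names to values, so r.headers from the session above could be passed in.

```python
def looks_safe_to_download(headers,
                           max_bytes=5 * 1024 * 1024,
                           allowed_types=("text/html", "text/plain")):
    """Return True if the headers advertise an allowed type and a small enough size.

    Hypothetical helper: the limit and the allowed types are example values.
    """
    # Header names are case-insensitive, so normalise them first.
    lowered = dict((name.lower(), value) for name, value in headers.items())

    # Drop parameters such as "; charset=utf-8" before comparing the type.
    content_type = lowered.get("content-type", "").split(";")[0].strip().lower()
    if content_type not in allowed_types:
        return False

    length = lowered.get("content-length")
    if length is None:
        # Size unknown: treated as unsafe here (an assumption of this sketch).
        return False
    try:
        return 0 <= int(length) <= max_bytes
    except ValueError:
        # Malformed Content-Length header.
        return False
```

Since a server can omit or misreport Content-Length, it is probably still worth capping the number of bytes read during the actual download.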

Source

2012-04-01 19:43:35 jterrace
