Check whether an HTTPS web page exists in Python

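In a Python 2.x script, I am looking for a way to check whether an HTTPS page returns specific content (finding it will probably require parsing the page content). The page also sits behind an htpasswd prompt, so a username and password are needed to see the content. So I think I am looking for a module or other functionality that lets me hard-code the username and password so that it can fetch the page, and I can then work with the output (that is, check whether a keyword that is the equivalent of a 404 page is present).

I was looking at http://docs.python.org/2/library/httplib.html, but it doesn't seem to do what I'm looking for.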

Source

2014-02-23 Peter

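You could do this with the httplib module, but there are easier ways that don't require driving the HTTP protocol by hand.

Using the requests library is probably the simplest (it is an external module that has to be installed first):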

import requests

# yoururl is a placeholder for the HTTPS page to check
auth = ('someusername', 'somepassword')
response = requests.get(yoururl, auth=auth)
response.raise_for_status()

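This raises an exception if the response was not successful, for example a 404 Not Found.

You can then parse the response body further using response.content (a byte string) or response.text (the response decoded to unicode).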
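The check the question describes (a successful status plus the absence of a "not found" keyword in the body) could be sketched as a small helper. This is only an illustration: page_exists, missing_marker and the sample strings are made-up names and values, not part of requests, and the right marker depends on what the site's error page actually contains.

```python
def page_exists(status_code, text, missing_marker):
    """Return True when the status is a success and the marker is absent.

    page_exists and missing_marker are illustrative names chosen for this
    sketch; they do not come from the requests library.
    """
    if not 200 <= status_code < 300:
        return False
    return missing_marker not in text


# Offline demonstration with made-up values (no network access needed)
print(page_exists(200, u"<h1>Welcome</h1>", u"Page not found"))
print(page_exists(200, u"<h1>Page not found</h1>", u"Page not found"))
print(page_exists(404, u"", u"Page not found"))
```

With requests, the call would presumably look like page_exists(response.status_code, response.text, u'Page not found'), made instead of (or before) response.raise_for_status() so that an error status yields False rather than an exception.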

Using just the standard library, with the urllib2 module, it would look like this:

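import base64
import urllib2

# yoururl is a placeholder for the HTTPS page to check
request = urllib2.Request(yoururl)
# b64encode, unlike encodestring, never inserts newlines into long credentials
authstring = base64.b64encode('{}:{}'.format('someusername', 'somepassword'))
request.add_header("Authorization", "Basic {}".format(authstring))
# urlopen itself raises urllib2.HTTPError for 4xx and 5xx responses such as 404
response = urllib2.urlopen(request)

if not 200 <= response.getcode() < 400:
    # error response; raise an exception here
    raise RuntimeError('unexpected status {}'.format(response.getcode()))

content = response.read()
# getparam() accepts no default argument, so fall back to UTF-8 explicitly
charset = response.info().getparam('charset') or 'utf8'
try:
    text = content.decode(charset)
except (LookupError, UnicodeDecodeError):
    text = content.decode('ascii', 'replace')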

Here content is the byte string content of the response body, and text is the Unicode value, decoded on a best-effort basis.
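The Authorization header that the urllib2 example builds by hand can be isolated into a small function, which makes the encoding step easier to see and to check. This is a sketch only: basic_auth_header is a hypothetical helper name, the UTF-8 encoding of the credentials is an assumption, and the sample credentials are the well-known example from the HTTP Basic authentication RFC.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header.

    basic_auth_header is an illustrative name; encoding the credentials
    as UTF-8 is an assumption made for this sketch.
    """
    credentials = u"{}:{}".format(username, password).encode("utf-8")
    # b64encode returns bytes; decode so the result is a text header value
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Example credentials taken from the HTTP Basic authentication RFC
print(basic_auth_header("Aladdin", "open sesame"))
# prints: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

The returned string could then be passed to request.add_header("Authorization", ...) in place of the inline formatting.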

Source

2014-02-23 12:51:46

Very nice. 'requests' seems to work just fine. \o/ – Peter
