2011-06-21 60 views
4

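Using Python to browse an NTLM-protected website

My task is to create a script that logs in to a corporate portal, goes to a specific page, downloads it, compares it with an earlier version, and then emails a particular person depending on what changed. The later parts are easy; it is the first step that is giving me the most trouble.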
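The "compare with an earlier version" step described above could be sketched with the standard-library `difflib` module. This is only an illustration: the function name `summarize_changes` and the `previous`/`current` labels are made up for the example, and it assumes both snapshots are already available as text.

```python
import difflib


def summarize_changes(old_text, new_text):
    """Return a unified diff of two page snapshots as a list of lines.

    An empty list means the two snapshots are identical.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    return list(difflib.unified_diff(
        old_lines, new_lines,
        fromfile="previous", tofile="current", lineterm=""))


if __name__ == "__main__":
    # Hypothetical snapshots standing in for two downloads of the page.
    for line in summarize_changes("a\nb\n", "a\nc\n"):
        print(line)
```

A non-empty result could then decide whether an email is sent and supply its body.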

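After failing to connect with urllib2 (I am trying to do this in Python) and about 4 or 5 hours of googling, I determined that the reason I cannot connect is the NTLM authentication on the web page. I have tried a number of different connection approaches found on this site and others, to no avail. Based on the NTLM example, I did: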

import urllib2 
from ntlm import HTTPNtlmAuthHandler 

user = 'username' 
password = "password" 
url = "https://portal.whatever.com/" 

passman = urllib2.HTTPPasswordMgrWithDefaultRealm() 
passman.add_password(None, url, user, password) 
# create the NTLM authentication handler 
auth_NTLM = HTTPNtlmAuthHandler.HTTPNtlmAuthHandler(passman) 

# create and install the opener 
opener = urllib2.build_opener(auth_NTLM) 
urllib2.install_opener(opener) 

# create a header 
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)' 
header = { 'Connection' : 'Keep-alive', 'User-Agent' : user_agent} 

response = urllib2.urlopen(urllib2.Request(url, None, header)) 

When I run this (with the real username, password, and URL), I get the following:

Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "ntlm2.py", line 21, in <module> 
    response = urllib2.urlopen(urllib2.Request(url, None, header)) 
    File "C:\Python27\lib\urllib2.py", line 126, in urlopen 
    return _opener.open(url, data, timeout) 
    File "C:\Python27\lib\urllib2.py", line 400, in open 
    response = meth(req, response) 
    File "C:\Python27\lib\urllib2.py", line 513, in http_response 
    'http', request, response, code, msg, hdrs) 
    File "C:\Python27\lib\urllib2.py", line 432, in error 
    result = self._call_chain(*args) 
    File "C:\Python27\lib\urllib2.py", line 372, in _call_chain 
    result = func(*args) 
    File "C:\Python27\lib\urllib2.py", line 619, in http_error_302 
    return self.parent.open(new, timeout=req.timeout) 
    File "C:\Python27\lib\urllib2.py", line 400, in open 
    response = meth(req, response) 
    File "C:\Python27\lib\urllib2.py", line 513, in http_response 
    'http', request, response, code, msg, hdrs) 
    File "C:\Python27\lib\urllib2.py", line 432, in error 
    result = self._call_chain(*args) 
    File "C:\Python27\lib\urllib2.py", line 372, in _call_chain 
    result = func(*args) 
    File "C:\Python27\lib\urllib2.py", line 619, in http_error_302 
    return self.parent.open(new, timeout=req.timeout) 
    File "C:\Python27\lib\urllib2.py", line 400, in open 
    response = meth(req, response) 
    File "C:\Python27\lib\urllib2.py", line 513, in http_response 
    'http', request, response, code, msg, hdrs) 
    File "C:\Python27\lib\urllib2.py", line 438, in error 
    return self._call_chain(*args) 
    File "C:\Python27\lib\urllib2.py", line 372, in _call_chain 
    result = func(*args) 
    File "C:\Python27\lib\urllib2.py", line 521, in http_error_default 
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp) 
    urllib2.HTTPError: HTTP Error 401: Unauthorized 

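The most interesting thing about this trace, to me, is that the last line says a 401 error was sent back. From what I have read, a 401 error is the first message sent back to the client when NTLM starts. I was under the impression that the purpose of python-ntlm was to handle the NTLM process for me. Is that wrong, or am I using it incorrectly? Also, I am not bound to Python for this, so if there is an easier way to do it in another language, let me know (from what I saw while googling, there is not). Thanks!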

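A 401 is sent back as the first response to initiate NTLM/Negotiate authentication. But it is also the final response after your authentication has failed. Are you sure the server supports NTLM authentication? It is often disabled, with only Negotiate (aka SPNEGO, aka Kerberos) authentication supported.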

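So it might be a different type (Kerberos?). Come to think of it, when I tried accessing it in different ways, it always said 'Negotiate' in the WWW-Authenticate field of the headers. Do you know if there is any support for Kerberos? – jias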

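So the authentication header is basically just the base64'd input and output values of GSSAPI calls. Something like python-krb5 https://fedorahosted.org/python-krbV/ might help. But if you are not already doing Kerberos at your site, this may be a whole new can of worms. You may want to try making sure that IIS has NTLM enabled: http://support.microsoft.com/kb/215383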
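To make the "base64'd token" point concrete, here is a rough sketch that decodes the token part of an `Authorization` or `WWW-Authenticate` value and guesses what it carries. It relies on two facts about the wire formats: NTLM messages begin with the 8-byte signature `NTLMSSP\0` followed by a little-endian 32-bit message type, and GSS-API initial tokens (as used by SPNEGO/Kerberos) begin with the ASN.1 byte `0x60`. The function name `describe_auth_token` and the labels it returns are invented for this example, and the classification is a heuristic, not a full parser.

```python
import base64
import struct

NTLM_SIGNATURE = b"NTLMSSP\x00"
NTLM_MESSAGE_TYPES = {1: "NEGOTIATE", 2: "CHALLENGE", 3: "AUTHENTICATE"}


def describe_auth_token(header_value):
    """Classify a header value such as 'NTLM TlRMTVNTUAAB...'.

    Returns a (scheme, description) tuple.
    """
    scheme, _, encoded = header_value.strip().partition(" ")
    if not encoded:
        # A bare scheme name, e.g. the server's initial "Negotiate" offer.
        return scheme, "no token"
    raw = base64.b64decode(encoded.strip())
    if raw.startswith(NTLM_SIGNATURE) and len(raw) >= 12:
        # The message type is a 4-byte little-endian integer at offset 8.
        (msg_type,) = struct.unpack_from("<I", raw, 8)
        return scheme, "NTLM " + NTLM_MESSAGE_TYPES.get(msg_type, "unknown")
    if raw[:1] == b"\x60":
        return scheme, "GSS-API token (likely SPNEGO/Kerberos)"
    return scheme, "unrecognized token"
```

Note that a `Negotiate` header can itself carry a raw NTLM token, so the scheme name alone does not say which mechanism is actually in use.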

Answer

1

If the site uses NTLM authentication, the headers attribute of the resulting HTTPError should say so:

>>> try:
...     handle = urllib2.urlopen(req)
... except IOError as e:
...     print e.headers
...
<other headers> 
WWW-Authenticate: Negotiate 
WWW-Authenticate: NTLM
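The same check can be done in code rather than by eye. The sketch below takes the list of `WWW-Authenticate` values from a 401 response and reports which schemes the server offers. The helper names `offered_auth_schemes` and `ntlm_is_offered` are hypothetical, and the code assumes one challenge per header line (as in the output above), not several comma-separated challenges in a single header.

```python
def offered_auth_schemes(www_authenticate_values):
    """Return the scheme names from a list of WWW-Authenticate values.

    Assumes each value holds a single challenge, e.g. "NTLM" or
    'Basic realm="portal"'.
    """
    schemes = []
    for value in www_authenticate_values:
        parts = value.split(None, 1)
        if parts:
            schemes.append(parts[0])
    return schemes


def ntlm_is_offered(www_authenticate_values):
    """True if any of the offered schemes is NTLM (case-insensitive)."""
    return any(scheme.upper() == "NTLM"
               for scheme in offered_auth_schemes(www_authenticate_values))


if __name__ == "__main__":
    print(offered_auth_schemes(["Negotiate", "NTLM"]))
    print(ntlm_is_offered(["Negotiate"]))
```

With urllib2 on Python 2 the list of values should be available as `e.headers.getheaders('WWW-Authenticate')`; on Python 3 the equivalent is `e.headers.get_all('WWW-Authenticate')`. If only `Negotiate` comes back, as the asker reports in the comments, python-ntlm has no NTLM challenge to respond to.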