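Disable SSL certificate verification in Scrapy

I'm currently wrestling with a problem in Scrapy. Whenever I use Scrapy to scrape an HTTPS site where the certificate's CN value matches the server's domain name, Scrapy works great! On the other hand, whenever I try to scrape a site where the certificate's CN value does not match the server's domain name, I get the following: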
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/twisted/protocols/tls.py", line 415, in dataReceived
    self._write(bytes)
  File "/usr/local/lib/python2.7/dist-packages/twisted/protocols/tls.py", line 554, in _write
    sent = self._tlsConnection.send(toSend)
  File "/usr/local/lib/python2.7/dist-packages/OpenSSL/SSL.py", line 1270, in send
    result = _lib.SSL_write(self._ssl, buf, len(buf))
  File "/usr/local/lib/python2.7/dist-packages/OpenSSL/SSL.py", line 926, in wrapper
    callback(Connection._reverse_mapping[ssl], where, return_code)
--- <exception caught here> ---
  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/_sslverify.py", line 1055, in infoCallback
    return wrapped(connection, where, ret)
  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/_sslverify.py", line 1154, in _identityVerifyingInfoCallback
    verifyHostname(connection, self._hostnameASCII)
  File "/usr/local/lib/python2.7/dist-packages/service_identity/pyopenssl.py", line 30, in verify_hostname
    obligatory_ids=[DNS_ID(hostname)],
  File "/usr/local/lib/python2.7/dist-packages/service_identity/_common.py", line 235, in __init__
    raise ValueError("Invalid DNS-ID.")
exceptions.ValueError: Invalid DNS-ID.
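I've looked through as much documentation as I can, and as far as I can tell Scrapy has no way to disable SSL certificate verification. Even the documentation for the Scrapy Request object (which is where I would have thought this functionality would live) makes no reference to it: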
http://doc.scrapy.org/en/1.0/topics/request-response.html#scrapy.http.Request
https://github.com/scrapy/scrapy/blob/master/scrapy/http/request/__init__.py
There are also no Scrapy settings that address the issue:
http://doc.scrapy.org/en/1.0/topics/settings.html
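Short of using Scrapy as a library and modifying the source as needed, does anyone have any ideas on how I can disable SSL certificate verification?
Thanks!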
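From looking at the docs, it seems you can modify the 'DOWNLOAD_HANDLERS' or 'DOWNLOAD_HANDLERS_BASE' settings to change how Scrapy handles https. From there you would probably have to create your own modified 'HttpDownloadHandler' that gets past the error you are receiving. – Monkpit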
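/me bangs head on desk. This certainly looks promising. Could you write it up as an answer so that I can accept it, and then I'll add the code I used for others' future reference? – MoarCodePlz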
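For anyone landing here later, here is a minimal sketch of what Monkpit's suggestion might look like in a project's settings.py. 'DOWNLOAD_HANDLERS' is the Scrapy setting named in the comment above; the module path myproject.handlers and the class name LenientHttpsDownloadHandler are made-up placeholders for a handler you would still have to write yourself. This has not been tested against the traceback in the question.

```python
# settings.py (sketch only)
# Map the https scheme to a custom download handler.
# ASSUMPTION: "myproject.handlers.LenientHttpsDownloadHandler" is a
# hypothetical dotted path. The class is not part of Scrapy and must be
# implemented in your own project so that it tolerates the hostname
# mismatch that raises "Invalid DNS-ID".
DOWNLOAD_HANDLERS = {
    "https": "myproject.handlers.LenientHttpsDownloadHandler",
}
```

Scrapy merges 'DOWNLOAD_HANDLERS' over the defaults in 'DOWNLOAD_HANDLERS_BASE', so only the scheme you want to change needs to be listed, and the other schemes keep their stock handlers.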