2012-10-31 43 views

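HTTPS request results in reset connection in Windows with Python 3

When I use the following function with Python 3.2.3 under cygwin, any request to any https host hangs. After 60 seconds it fails with this error: [Errno 104] Connection reset by peer.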

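UPDATE: I thought this was limited to cygwin, but it also happens on Windows 7 64-bit with Python 3.3. I'll try 3.2 now. The error when using the Windows command shell is: urlopen error [WinError 10054] An existing connection was forcibly closed by the remote host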
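The two messages above ([Errno 104] under cygwin, [WinError 10054] in the Windows shell) describe the same condition. As a minimal sketch, assuming Python 3.3 or later, where both are raised as ConnectionResetError, a platform-independent check could look like this. The helper name is_connection_reset is hypothetical and not part of the original function:

```python
import errno
from urllib.error import URLError


def is_connection_reset(exc):
    """Return True if exc (or the reason wrapped in a URLError) is a reset."""
    # urlopen wraps low-level socket errors in URLError.reason.
    reason = exc.reason if isinstance(exc, URLError) else exc
    if isinstance(reason, ConnectionResetError):
        return True
    # Fallback for error objects that only carry an errno value.
    return getattr(reason, "errno", None) == errno.ECONNRESET
```

This avoids comparing against hard-coded numbers such as 104 or 10054, which differ between platforms.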

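UPDATE2 (Electric-Bugaloo): This is limited to a couple of the sites I'm trying to use. I tested against Google and other major sites without any problem. It appears to be related to this bug: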

http://bugs.python.org/issue16361

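Specifically, the server hangs after the client hello. This is caused by the version of openssl that ships with the compiled builds of Python 3.2 and 3.3, which misidentifies the server's SSL version. Now I need code that automatically downgrades my SSL version to SSLv3 when opening a connection to the affected sites, as in this post: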

How to use urllib2 to get a webpage using SSLv3 encryption

but I can't get it to work.
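Before forcing a protocol version, it can help to find out which versions the affected server actually completes a handshake with. The following is only a sketch under assumptions: it relies on ssl.TLSVersion and the minimum_version/maximum_version attributes, which need Python 3.7 or later (newer than the versions in the question), and the function name probe is hypothetical.

```python
import socket
import ssl


def probe(host, version, port=443, timeout=10):
    """Try a handshake restricted to one TLS version.

    Returns the negotiated protocol name, or None if the attempt failed.
    """
    ctx = ssl.create_default_context()
    # Offer exactly one protocol version to the server.
    ctx.minimum_version = version
    ctx.maximum_version = version
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                return tls.version()
    except OSError:
        # Covers refused or reset connections, timeouts and ssl.SSLError.
        return None
```

Calling probe once per version of interest, for example ssl.TLSVersion.TLSv1_2, shows which versions the host accepts and which ones stall or reset.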

import logging
import socket
import sys
import time
import urllib.request
from datetime import datetime
from gzip import GzipFile
from io import BytesIO
from urllib.error import URLError

logger = logging.getLogger(__name__)


def worker(url, body=None, bt=None):
    '''This function does all the requests to wherever for data

    takes in a url, optional body utf-8 encoded please, and optional body type'''

    hdrs = {'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language': 'en-us,en;q=0.5',
            'Accept-Encoding': 'gzip,deflate',
            'User-Agent': "My kewl Python tewl!"}
    if 'myweirdurl' in url:
        # Same headers, but this host gets a different User-Agent.
        hdrs['User-Agent'] = "Netscape 6.0"
    if bt:
        hdrs['Content-Type'] = bt
    urlopen = urllib.request.urlopen
    Request = urllib.request.Request
    start_req = time.time()
    logger.debug('request start: {}'.format(datetime.now().ctime()))
    if 'password' not in url:
        logger.debug('request url: {}'.format(url))
    req = Request(url, data=body, headers=hdrs)
    try:
        if body:
            logger.debug("body: {}".format(body))
        # The body is already attached to req, so data= is not repeated here.
        handle = urlopen(req, timeout=298)
    # URLError has to be caught first: since Python 3.3 socket.error is an
    # alias of OSError, which is a base class of URLError.
    except URLError as ue:
        end_time = time.time()
        logger.error(ue)
        logger.error(hasattr(ue, 'code'))
        logger.error(hasattr(ue, 'reason'))
        if hasattr(ue, 'code'):
            # HTTPError: the server answered with an error status.
            logger.warning('The server couldn\'t fulfill the request.')
            logger.error('Error code: {}'.format(ue.code))
            if ue.code == 404:
                return "Resource Not Found (404)"
        else:
            # Plain URLError always carries a reason.
            logger.warning('We failed to reach a server with {}'.format(url))
            logger.error('Reason: {}'.format(ue.reason))
            logger.error(type(ue.reason))
            # ue.reason is usually an exception object, not a string, so
            # comparing it to a string with == never matches.
            reason_errno = getattr(ue.reason, 'errno', None)
            logger.error(reason_errno)
            # 60 is ETIMEDOUT on macOS/BSD; the value differs elsewhere.
            if reason_errno == 60 or 'Operation timed out' in str(ue.reason):
                logger.error("Arrggghh, timed out!")
                return "Operation timed out"
        logger.error("req time: {}".format(end_time - start_req))
        logger.error("returning: Server Error")
        return "Server Error"
    except socket.error as se:
        logger.error(se)
        logger.error(se.errno)
        logger.error(type(se))
        # Compare the errno value itself; hasattr() only returns True/False.
        if se.errno == 60:
            logger.error("returning: Request Timed Out")
            return 'Request Timed Out'
    else:
        resp_headers = dict(handle.info())
        logger.debug('Here are the headers of the page : {}'.format(resp_headers))
        logger.debug("The true URL in case of redirects {}".format(handle.geturl()))
        ce = resp_headers.get('Content-Encoding')
        if ce is not None:
            logger.debug('Content-Encoding: {}'.format(ce))
        ct = resp_headers.get('Content-Type')
        if ct is not None:
            logger.debug('Content-Type: {}'.format(ct))
        raw = handle.read()
        if ce == "gzip":
            logger.debug("Unzipping payload")
            raw = GzipFile(fileobj=BytesIO(raw), mode="rb").read()
        # Guard against a missing Content-Type before calling .lower().
        if ct is not None and ("charset=utf-8" in ct.lower()
                               or ct in ('text/html', 'text/plain')):
            return raw.decode("utf-8")
        logger.debug("Unknown content type: {}".format(ct))
        sys.exit()

Have you tried it with urllib2? – andrean


Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600 32 bit (Intel)] on win32 Type "copyright", "credits" or "license()" for more information. >>> import urllib2 Traceback (most recent call last): File "", line 1, in import urllib2 ImportError: No module named 'urllib2' – user352472


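Sorry, my bad, I was thinking in Python 2 :) On the other hand, try setting the user agent string to a real browser user agent string (http://www.useragentstring.com/). Also try opening some other https URLs to see whether the same problem occurs. – andrean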

Answers


I figured it out. Here is the block of code needed to make it work on Windows:

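'''had to add this windows specific block to handle this bug in urllib2:
http://bugs.python.org/issue11220
'''
import ssl
import urllib.request
from platform import platform

if "windows" in platform().lower():
    # Test each name separately; a bare non-empty string is always true.
    if any(name in url.lower() for name in ('my_wacky_url', 'my_other_wacky_url')):
        # Pin the protocol to TLSv1 for the affected hosts.
        https_handler = urllib.request.HTTPSHandler(
            context=ssl.SSLContext(ssl.PROTOCOL_TLSv1))
        opener = urllib.request.build_opener(https_handler)
        # install_opener makes every later urlopen() call use this handler.
        urllib.request.install_opener(opener)
#end of urllib workaround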

I added this blob right before the first try: block and it worked like a charm. Thanks for all the help!
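Note that install_opener changes the opener for every later urlopen() call in the process. A possible variation, sketched here under assumptions, builds a separate opener only for the affected hosts and leaves the global one alone. The host names in AFFECTED_HOSTS are hypothetical placeholders, the choice of TLS 1.2 is an assumption (the block above pins TLSv1), and ssl.TLSVersion needs Python 3.7 or later:

```python
import ssl
import urllib.request
from urllib.parse import urlsplit

# Hypothetical host names standing in for the affected sites.
AFFECTED_HOSTS = {"my_wacky_url.example", "my_other_wacky_url.example"}


def needs_pinning(url):
    """Return True if the URL's host is one of the affected sites."""
    host = urlsplit(url).hostname or ""
    return host in AFFECTED_HOSTS


def pinned_context():
    """Return a client context that offers a single TLS version."""
    ctx = ssl.create_default_context()
    # Assumption: the server handles TLS 1.2 once nothing else is offered.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def build_opener_for(url):
    """Return an opener; only affected hosts get the pinned context."""
    if needs_pinning(url):
        handler = urllib.request.HTTPSHandler(context=pinned_context())
        return urllib.request.build_opener(handler)
    return urllib.request.build_opener()
```

Inside worker, the call would then become build_opener_for(url).open(req, timeout=298) instead of urlopen(req, timeout=298), so requests to other hosts keep the default settings.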