2016-08-21

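Reading CONNECT headers

I'm using a proxy service (proxymesh) that puts useful information into the headers it sends in response to a CONNECT request. For whatever reason, Python's httplib doesn't parse them: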

> CONNECT example.com:443 HTTP/1.1 
> Host: example.com:443 
> 
< HTTP/1.1 200 Connection established 
< X-Useful-Header: value # completely ignored 
< 

The requests module uses httplib internally, so it ignores them too. How can I extract the headers from the response to a CONNECT request?
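To make the exchange above concrete, here is a minimal sketch of what reading the proxy's reply by hand could look like. `read_connect_reply` and `MAX_LINE` are hypothetical names made up for this illustration, not part of httplib or requests, and the function only assumes a binary file-like object positioned at the start of the reply.

```python
import io

MAX_LINE = 65536  # arbitrary cap on a single line, chosen for this sketch


def read_connect_reply(fp):
    """Read a proxy's reply to CONNECT from a binary file-like object.

    Returns (status_code, reason, headers), where headers is a list of
    (name, value) pairs in the order the proxy sent them.
    """
    status_line = fp.readline(MAX_LINE + 1).decode("iso-8859-1").rstrip("\r\n")
    parts = status_line.split(None, 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError("not an HTTP status line: %r" % status_line)
    code = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    headers = []
    while True:
        line = fp.readline(MAX_LINE + 1)
        if len(line) > MAX_LINE:
            raise ValueError("header line too long")
        if line in (b"", b"\r\n", b"\n"):
            break  # a blank line (or EOF) ends the header block
        name, sep, value = line.decode("iso-8859-1").partition(":")
        if sep:  # skip anything that is not a "Name: value" line
            headers.append((name.strip(), value.strip()))
    return code, reason, headers


# The reply from the transcript above, as it would appear on the wire.
reply = io.BytesIO(
    b"HTTP/1.1 200 Connection established\r\n"
    b"X-Useful-Header: value\r\n"
    b"\r\n"
)
code, reason, headers = read_connect_reply(reply)
```

On a real connection, `fp` would presumably be something like `sock.makefile("rb")` on the socket to the proxy, read right after sending the CONNECT request and before starting the TLS handshake.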


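There are two things wrong with this picture. First, it is perfectly fine for the response to a CONNECT to carry headers. Second, you shouldn't have any payload, because what follows is a TLS handshake initiated by the client. – Adrien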


@Adrien: As I said, `httplib` discards the headers sent in the reply to a `CONNECT`. It shouldn't, but it does. As for the payload part, I agree, that was wrong. I've removed it. – Blender

Answer


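Python's httplib does indeed ignore these headers when it sets up the tunnel. It's hacky, but you can intercept the header lines and merge them with the headers of the actual HTTP response: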

import socket 
import httplib 
import requests 

from requests.packages.urllib3.connection import HTTPSConnection 
from requests.packages.urllib3.connectionpool import HTTPSConnectionPool 
from requests.packages.urllib3.poolmanager import ProxyManager 

from requests.adapters import HTTPAdapter 


class ProxyHeaderHTTPSConnection(HTTPSConnection):
    def __init__(self, *args, **kwargs):
        super(ProxyHeaderHTTPSConnection, self).__init__(*args, **kwargs)
        # Raw header lines from the proxy's reply to CONNECT
        self._proxy_headers = []

    def _tunnel(self):
        # Mirrors httplib's own _tunnel() (Python 2.7), except that the
        # proxy's header lines are saved instead of being thrown away.
        self.send("CONNECT %s:%d HTTP/1.0\r\n" % (self._tunnel_host, self._tunnel_port))

        for header, value in self._tunnel_headers.iteritems():
            self.send("%s: %s\r\n" % (header, value))

        self.send("\r\n")

        response = self.response_class(self.sock, strict=self.strict, method=self._method)
        version, code, message = response._read_status()

        if version == "HTTP/0.9":
            # HTTP/0.9 doesn't support the CONNECT verb, so if httplib has
            # concluded HTTP/0.9 is being used something has gone wrong.
            self.close()
            raise socket.error("Invalid response from tunnel request")

        if code != 200:
            self.close()
            raise socket.error("Tunnel connection failed: %d %s" % (code, message.strip()))

        self._proxy_headers = []

        while True:
            line = response.fp.readline(httplib._MAXLINE + 1)

            if len(line) > httplib._MAXLINE:
                # LineTooLong lives in httplib; the bare name is not imported here
                raise httplib.LineTooLong("header line")

            # EOF or a blank line ends the proxy's header block
            if not line or line == '\r\n':
                break

            # The line is a header, save it
            if ':' in line:
                self._proxy_headers.append(line)

    def getresponse(self, buffering=False):
        response = super(ProxyHeaderHTTPSConnection, self).getresponse(buffering)
        # Append the saved proxy header lines to the real response's raw headers
        response.msg.headers.extend(self._proxy_headers)

        return response


class ProxyHeaderHTTPSConnectionPool(HTTPSConnectionPool):
    ConnectionCls = ProxyHeaderHTTPSConnection


class ProxyHeaderProxyManager(ProxyManager):
    def _new_pool(self, scheme, host, port):
        assert scheme == 'https'

        return ProxyHeaderHTTPSConnectionPool(host, port, **self.connection_pool_kw)


class ProxyHeaderHTTPAdapter(HTTPAdapter):
    def proxy_manager_for(self, proxy, **proxy_kwargs):
        if proxy in self.proxy_manager:
            manager = self.proxy_manager[proxy]
        else:
            proxy_headers = self.proxy_headers(proxy)
            manager = self.proxy_manager[proxy] = ProxyHeaderProxyManager(
                proxy_url=proxy,
                proxy_headers=proxy_headers,
                num_pools=self._pool_connections,
                maxsize=self._pool_maxsize,
                block=self._pool_block,
                **proxy_kwargs)

        return manager

You can then mount the adapter on a session:

session = requests.Session() 
session.mount('https://', ProxyHeaderHTTPAdapter()) 

response = session.get('https://example.com', proxies={...}) 

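The proxy's headers will be merged into the response headers, so it should behave as if the proxy had modified the response headers directly.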
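If merging is not what you want (for instance, when a proxy header could collide with one sent by the origin server), the raw lines kept in `_proxy_headers` could be exposed separately instead. A rough sketch of turning such lines into a mapping follows. `proxy_headers_to_dict` is a hypothetical helper, not part of requests or urllib3, and `X-Example-Other` is a made-up header name used only to show how repeated headers are joined.

```python
def proxy_headers_to_dict(raw_lines):
    """Turn raw "Name: value" header lines into a dict keyed by lowercase name.

    Repeated headers are joined with ", ", the usual way of combining
    multiple values of the same HTTP header.
    """
    parsed = {}
    for line in raw_lines:
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a header line, ignore it
        key = name.strip().lower()
        value = value.strip()
        if key in parsed:
            parsed[key] = parsed[key] + ", " + value
        else:
            parsed[key] = value
    return parsed


# Lines in the same shape that _tunnel() stores them (terminator included).
saved_lines = [
    "X-Useful-Header: value\r\n",
    "X-Example-Other: a\r\n",  # hypothetical header, for illustration only
    "X-Example-Other: b\r\n",
]
result = proxy_headers_to_dict(saved_lines)
```

Header names are lowercased here because HTTP header names are case-insensitive; a lookup would then use `result["x-useful-header"]`.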