2012-04-12 117 views
21

Python requests: fetch a file from a local URL

I am using Python's requests library in one method of my application. The body of the method looks like this:

def handle_remote_file(url, **kwargs): 
    response = requests.get(url, ...) 
    buff = StringIO.StringIO() 
    buff.write(response.content) 
    ... 
    return True 

I want to write some unit tests for this method, but what I would like to do is pass it a fake local URL, for example:

class RemoteTest(TestCase):
    def setUp(self):
        self.url = 'file:///tmp/dummy.txt'

    def test_handle_remote_file(self):
        self.assertTrue(handle_remote_file(self.url))

When I call requests.get with a local URL, I get the KeyError exception below:

requests.get('file:///tmp/dummy.txt') 

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/packages/urllib3/poolmanager.pyc in connection_from_host(self, host, port, scheme) 
76 
77   # Make a fresh ConnectionPool of the desired type 
78   pool_cls = pool_classes_by_scheme[scheme] 
79   pool = pool_cls(host, port, **self.connection_pool_kw) 
80 

KeyError: 'file' 

The question is: how can I pass a local URL to requests.get? PS: I made up the example above. It may contain many errors.

+0

Can you use a local pure-Python web server? – zealotous 2012-04-12 13:39:46

+0

Yes. I set up a local web server in a new thread from my code using the SimpleHTTPServer library and used it to serve the remote file, and it then worked as expected. – ozgur 2012-04-20 10:44:45
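The local-server approach described in the comment above could be sketched roughly as follows. This is a minimal illustration, not the commenter's actual code: it assumes Python 3.7+ (where `SimpleHTTPServer` lives in `http.server`), and the helper name `start_local_server` is made up for this example.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def start_local_server(directory):
    """Serve `directory` over HTTP from a background thread.

    Hypothetical helper: returns the server object (call .shutdown() and
    .server_close() when finished) and the base URL it listens on.
    """
    # Bind the handler to the directory to serve (keyword added in Python 3.7).
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    # Port 0 asks the operating system for any free port.
    server = ThreadingHTTPServer(("127.0.0.1", 0), handler)
    # A daemon thread will not keep the test process alive on exit.
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    host, port = server.server_address[:2]
    return server, "http://{}:{}".format(host, port)
```

In a test case, `setUp` could call `start_local_server('/tmp')` and build `self.url` as the returned base URL plus `'/dummy.txt'`, with `tearDown` calling `shutdown()` and `server_close()` on the server. The method under test then sees an ordinary `http://` URL, so nothing in requests needs patching.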

Answers

20

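As @WooParadog explained, the requests library does not know how to handle local files. However, the current version allows you to define transport adapters.

So you can simply define your own adapter that is able to handle local files, for example: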

import os

import requests
from requests_testadapter import Resp


class LocalFileAdapter(requests.adapters.HTTPAdapter):
    def build_response_from_file(self, request):
        # Strip the leading 'file://' (7 characters) to get the filesystem path.
        file_path = request.url[7:]
        with open(file_path, 'rb') as file:
            buff = bytearray(os.path.getsize(file_path))
            file.readinto(buff)
            resp = Resp(buff)
            r = self.build_response(request, resp)

            return r

    def send(self, request, stream=False, timeout=None,
             verify=True, cert=None, proxies=None):
        # Skip the network entirely and answer from the local file.
        return self.build_response_from_file(request)


# Route every file:// URL requested through this session to the adapter.
requests_session = requests.session()
requests_session.mount('file://', LocalFileAdapter())
requests_session.get('file://<some_local_path>')

I use the requests-testadapter module in the example above.

4

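I ran into the same problem in a recent project. Since requests does not support the "file" scheme, I patch our code to load the content locally. First, I define a function to replace requests.get: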

import six


# Takes only the URL, so it can be called the same way as requests.get(url).
def local_get(url):
    "Fetch a stream from local files."
    p_url = six.moves.urllib.parse.urlparse(url)
    if p_url.scheme != 'file':
        raise ValueError("Expected file scheme")

    filename = six.moves.urllib.request.url2pathname(p_url.path)
    return open(filename, 'rb')

Then, somewhere in the test setup or as a decorator on the test function, I use mock.patch to patch the get function of requests:

@mock.patch('requests.get', local_get) 
def test_handle_remote_file(self): 
    ... 

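This technique is somewhat brittle: it does not help if the underlying code calls requests.request, or constructs a Session and calls that. There may be a way to patch requests at a lower level to support file: URLs, but in my initial investigation there didn't seem to be an obvious hook point, so I went with this simpler approach.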
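One thing to watch with this approach: local_get returns an open file object, so code that reads `response.content`, as handle_remote_file in the question does, would not find that attribute. A possible variation, sketched here with only the Python 3 standard library, wraps the bytes in a small stand-in object. The names `LocalResponse` and `local_get_response` are hypothetical, and the stand-in mimics just a few attributes of a real response.

```python
from urllib.parse import urlparse
from urllib.request import url2pathname


class LocalResponse(object):
    """Hypothetical stand-in exposing a small part of a response's interface."""

    def __init__(self, content):
        self.content = content   # raw bytes, like Response.content
        self.status_code = 200   # the file was read successfully
        self.ok = True


def local_get_response(url, **kwargs):
    """Read a file: URL from disk; extra keyword arguments are ignored."""
    parsed = urlparse(url)
    if parsed.scheme != "file":
        raise ValueError("Expected file scheme")
    # Read everything up front so no file handle is left open.
    with open(url2pathname(parsed.path), "rb") as handle:
        return LocalResponse(handle.read())
```

It would be patched in the same way, for example with `@mock.patch('requests.get', local_get_response)`, and it shares the brittleness described above: code that goes through requests.request or a Session is not affected by the patch.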

10

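Here is a transport adapter I wrote which is more featureful than b1r3k's and has no additional dependencies beyond Requests itself. I haven't tested it exhaustively yet, but what I have tried seems to be bug-free.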

import requests
import os, sys

if sys.version_info.major < 3:
    from urllib import url2pathname
else:
    from urllib.request import url2pathname

class LocalFileAdapter(requests.adapters.BaseAdapter):
    """Protocol Adapter to allow Requests to GET file:// URLs

    @todo: Properly handle non-empty hostname portions.
    """

    @staticmethod
    def _chkpath(method, path):
        """Return an HTTP status for the given filesystem path."""
        if method.lower() in ('put', 'delete'):
            return 501, "Not Implemented"  # TODO
        elif method.lower() not in ('get', 'head'):
            return 405, "Method Not Allowed"
        elif os.path.isdir(path):
            return 400, "Path Not A File"
        elif not os.path.isfile(path):
            return 404, "File Not Found"
        elif not os.access(path, os.R_OK):
            return 403, "Access Denied"
        else:
            return 200, "OK"

    def send(self, req, **kwargs):  # pylint: disable=unused-argument
        """Return the file specified by the given request

        @type req: C{PreparedRequest}
        @todo: Should I bother filling `response.headers` and processing
               If-Modified-Since and friends using `os.stat`?
        """
        path = os.path.normcase(os.path.normpath(url2pathname(req.path_url)))
        response = requests.Response()

        response.status_code, response.reason = self._chkpath(req.method, path)
        if response.status_code == 200 and req.method.lower() != 'head':
            try:
                response.raw = open(path, 'rb')
            except (OSError, IOError) as err:
                response.status_code = 500
                response.reason = str(err)

        if isinstance(req.url, bytes):
            response.url = req.url.decode('utf-8')
        else:
            response.url = req.url

        response.request = req
        response.connection = self

        return response

    def close(self):
        pass

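(Despite the name, it was written entirely before I thought to check Google, so it has nothing to do with b1r3k's.) As with the other answer, follow this with: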

requests_session = requests.session() 
requests_session.mount('file://', LocalFileAdapter()) 
r = requests_session.get('file:///path/to/your/file') 
+0

Thanks. The `except (OSError, IOError), err:` part is wrong. My replacement is `except (OSError, IOError) as err:` – 2017-08-08 07:57:26

+0

@LennartRolland At the time I posted, I was only using Requests on Python 2.x. I'll correct my post as soon as I can spare a few minutes to test the change. – ssokolow 2017-08-08 09:10:56

+0

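Nice work. But it doesn't work for local URLs like "../foo.bar". However, it is simple to change the send method so that it doesn't use 'req.path_url()' but instead uses something that strips 'file://' and keeps the rest. – rocky 2017-09-11 23:47:45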
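The change suggested in the last comment might look something like the helper below. It is only a sketch under assumptions: the name `file_url_to_path` is made up here, it uses the Python 3 import location, and it presumes that `req.url` still holds the URL exactly as it was passed in.

```python
from urllib.request import url2pathname


def file_url_to_path(url):
    """Strip a leading 'file://' and decode percent-escapes, keeping the rest.

    Hypothetical helper: unlike splitting the URL into host and path, this
    keeps a relative prefix such as '../' intact.
    """
    prefix = "file://"
    if not url.startswith(prefix):
        raise ValueError("Expected a file:// URL, got %r" % (url,))
    return url2pathname(url[len(prefix):])
```

Inside `send`, the path line could then read `path = os.path.normcase(os.path.normpath(file_url_to_path(req.url)))`. Relative paths would resolve against the current working directory of the process, which may or may not be what a given test expects.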