2017-05-26 15 views

Python requests cannot stream a UTF-8 encoded file

I am trying to stream a file that contains Unicode characters over HTTP, but I get a UnicodeEncodeError:

>>> requests.put(my_url, headers=my_headers, data=open('test.csv', 'r', encoding='utf-8')) 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File ".../python3.5/site-packages/requests/api.py", line 126, in put 
    return request('put', url, data=data, **kwargs) 
    File ".../python3.5/site-packages/requests/api.py", line 58, in request 
    return session.request(method=method, url=url, **kwargs) 
    File ".../python3.5/site-packages/requests/sessions.py", line 518, in request 
    resp = self.send(prep, **send_kwargs) 
    File ".../python3.5/site-packages/requests/sessions.py", line 639, in send 
    r = adapter.send(request, **kwargs) 
    File ".../python3.5/site-packages/requests/adapters.py", line 438, in send 
    timeout=timeout 
    File ".../python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 600, in urlopen 
    chunked=chunked) 
    File ".../python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 356, in _make_request 
    conn.request(method, url, **httplib_request_kw) 
    File ".../python3.5/http/client.py", line 1107, in request 
    self._send_request(method, url, body, headers) 
    File ".../python3.5/http/client.py", line 1152, in _send_request 
    self.endheaders(body) 
    File ".../python3.5/http/client.py", line 1103, in endheaders 
    self._send_output(message_body) 
    File ".../python3.5/http/client.py", line 936, in _send_output 
    self.send(message_body) 
    File ".../python3.5/http/client.py", line 904, in send 
    datablock = datablock.encode("iso-8859-1") 
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2122' in position 6375: ordinal not in range(256) 

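I get the error whether or not I include encoding='utf-8'. How can I send this file in a way that does not require loading the whole file into memory but still resolves the Unicode encoding problem?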

Answers

0

At least in my case, all I needed to do was open the file in binary mode:

>>> requests.put(my_url, headers=my_headers, data=open('test.csv', 'rb')) 

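Because the file is opened in binary mode, Python does not try to encode its contents; the bytes on disk are passed straight through to the URL.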
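The same binary-mode upload can be wrapped so that the file is closed once the request has been sent. This is only a minimal sketch, assuming the third-party requests package is installed; the helper name `put_file_streamed` and its parameters are made up for illustration and do not come from the original answer.

```python
def put_file_streamed(url, path, headers=None):
    """Upload the file at `path` with HTTP PUT without reading it into memory."""
    # Third-party dependency (assumed to be installed); imported inside the
    # function so that defining the helper does not require it.
    import requests

    # "rb" yields raw bytes, so nothing has to be encoded to ISO-8859-1.
    # The with block closes the file after the request has been sent.
    with open(path, "rb") as body:
        return requests.put(url, headers=headers, data=body)
```

With the names from the question, the call would be put_file_streamed(my_url, 'test.csv', headers=my_headers).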

0

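open(..., encoding="utf-8") does not encode the file contents; it does exactly the opposite. With it you are telling open() to decode the file contents into proper Unicode strings, and those cannot be losslessly encoded into the latin-1 that the request requires (yes, HTTP is that old) if they contain 'special' characters. You need to encode your content before sending it. Try: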

requests.put(my_url, headers=my_headers, data=open("test.csv", "r", encoding="utf-8").read().encode("utf-8")) 

Mind you, this is very bad form for handling file contents...
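If the text really has to be decoded first (for example because the file on disk is not UTF-8), one way to avoid holding the whole file in memory is to encode it piece by piece in a generator. The sketch below is an assumption-laden illustration, not part of the original answer: `iter_utf8_chunks`, `source_encoding` and `chunk_chars` are hypothetical names. If the file is already UTF-8 on disk, opening it with 'rb' is simpler.

```python
def iter_utf8_chunks(path, source_encoding="utf-8", chunk_chars=64 * 1024):
    """Yield the text file at `path` as UTF-8 encoded bytes, piece by piece."""
    # newline="" keeps the line endings exactly as they are in the file.
    with open(path, "r", encoding=source_encoding, newline="") as f:
        while True:
            piece = f.read(chunk_chars)  # at most chunk_chars characters
            if not piece:
                break
            # Encode explicitly so http.client never falls back to ISO-8859-1.
            yield piece.encode("utf-8")
```

It could then be passed as data=iter_utf8_chunks('test.csv') to requests.put. requests sends a generator body with chunked transfer encoding, so this only works if the server accepts chunked uploads.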