Python Requests: encoding POST data

Version: Python 2.7.3

Other libraries: Python Requests 1.2.3, Jinja2 (2.6)

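I have a script that submits data to a forum, and the problem is that non-ASCII characters show up as garbage. For example, a name like André Téchiné comes out as AndrÃ© TÃ©chinÃ©.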

Here is how the data is submitted:

1) The data is initially loaded from a UTF-8 encoded CSV file like this:

import codecs

entries = []
# Skip the header row, then map each remaining row onto the header names.
with codecs.open(filename, 'r', 'utf-8') as f:
    for row in unicode_csv_reader(f.readlines()[1:]):
        entries.append(dict(zip(csv_header, row)))

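unicode_csv_reader is taken from the bottom of the Python csv documentation page: http://docs.python.org/2/library/csv.html

When I type the entry's name in the interpreter, I see the name as u'Andr\xe9 T\xe9chin\xe9'.

2) Next, I render the data through Jinja2: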

tpl = tpl_env.get_template(u'forumpost.html') 
rendered = tpl.render(entries=entries)

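When I type the rendered name in the interpreter, I again see the same: u'Andr\xe9 T\xe9chin\xe9'

Now, if I write the rendered variable to a file like this, it displays correctly: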

with codecs.open('out.txt', 'a', 'utf-8') as f: 
    f.write(rendered)

But I have to send it to the forum:

3) In the POST request code, I have:

params = {u'post': rendered} 
headers = {u'content-type': u'application/x-www-form-urlencoded'} 
session.post(posturl, data=params, headers=headers, cookies=session.cookies)

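session is a Requests session.

And the name is displayed broken in the forum post. I have tried the following:

Leaving out the headers
Encoding rendered as rendered.encode('utf-8') (same result)
rendered = urllib.quote_plus(rendered) (comes out as all %XY)

If I type rendered.encode('utf-8'), I see the following: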

'Andr\xc3\xa9 T\xc3\xa9chin\xc3\xa9'

How can I fix this? Thanks.

2013-07-02 TheMagician

Your client behaves as it should. For example, with nc -l 8888 running as the server, making this request:

import requests 

requests.post('http://localhost:8888', data={u'post': u'Andr\xe9 T\xe9chin\xe9'})

shows:

POST / HTTP/1.1
Host: localhost:8888 
Content-Length: 33 
Content-Type: application/x-www-form-urlencoded 
Accept-Encoding: gzip, deflate, compress 
Accept: */* 
User-Agent: python-requests/1.2.3 CPython/2.7.3 

post=Andr%C3%A9+T%C3%A9chin%C3%A9

You can check that this is correct:

>>> import urllib 
>>> urllib.unquote_plus(b"Andr%C3%A9+T%C3%A9chin%C3%A9").decode('utf-8') 
u'Andr\xe9 T\xe9chin\xe9'
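
The same kind of check suggests why the urllib.quote_plus attempt from the question came out as all %XY: a dict passed as data is form-encoded when the request is built, so a value that was already percent-encoded gets encoded a second time. A minimal sketch of that effect using only the standard library (the variable names are illustrative, and the import fallback is there so it runs on both Python 2 and Python 3):

```python
try:
    from urllib.parse import quote_plus, unquote_plus  # Python 3
except ImportError:
    from urllib import quote_plus, unquote_plus  # Python 2

raw = u'Andr\xe9 T\xe9chin\xe9'.encode('utf-8')

# First pass: what a form encoder produces for the raw UTF-8 bytes.
once = quote_plus(raw)

# Second pass: what happens when an already-quoted value is encoded again.
# Every '%' becomes '%25' and the '+' becomes '%2B'.
twice = quote_plus(once)

# A server that decodes once gets the percent-escapes back as literal text.
decoded_once = unquote_plus(twice)

print(once)
print(twice)
print(decoded_once == once)
```

So the server decodes the body once and is left with the literal %C3%A9 sequences, which is what then appears in the post.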

Check that the server decodes the request correctly. You could try specifying the charset:
```
headers = {"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"} 
```
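The body contains only ASCII characters, so it shouldn't hurt, and a correct server would ignore any parameters on the x-www-form-urlencoded type anyway. Look for the gory details in "URL-encoded form data".
Check that the problem is not a display artifact, i.e. the value is correct but it is displayed incorrectly.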
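
To illustrate the display-artifact case: when correct UTF-8 bytes are shown by something that assumes ISO-8859-1, each two-byte é turns into two separate characters. A small sketch (variable names are illustrative; it runs on Python 2.7 and Python 3):

```python
# -*- coding: utf-8 -*-
name = u'Andr\xe9 T\xe9chin\xe9'

# The bytes that actually travel over the wire.
utf8_bytes = name.encode('utf-8')

# A viewer that assumes ISO-8859-1 maps every byte to its own character,
# so each two-byte e-acute becomes A-tilde followed by a copyright sign.
misread = utf8_bytes.decode('iso-8859-1')

# Undoing the wrong decoding recovers the original text, which shows that
# the stored bytes were fine and only the interpretation was off.
recovered = misread.encode('iso-8859-1').decode('utf-8')

print(len(name), len(misread))
print(misread == u'Andr\xc3\xa9 T\xc3\xa9chin\xc3\xa9')
print(recovered == name)
```

If the garbage on the page round-trips like this, the data itself arrived intact and the mismatch is in how the page declares or assumes its encoding.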

2013-07-02 06:50:49 jfs

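"Check that the problem is not a display artifact, i.e. the value is correct but it is displayed incorrectly" - Thanks. That was the problem! Unfortunately, this is a public forum and I can't change the default encoding. It responds with iso-8859-1 encoding. Can I use rendered.encode('iso-8859-1'), or will that break things? Thanks. – TheMagician

Try setting the charset in the header – jfs

That didn't work. – TheMagician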
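
On the ISO-8859-1 question raised in the comments, one option is to build the form body by hand in the charset the forum expects and pass the resulting string as data. This is only a sketch: encode_form is a hypothetical helper, not part of Requests, and whether this particular forum accepts the result is an assumption that would need testing. It runs on Python 2.7 and Python 3.

```python
try:
    from urllib.parse import quote_plus  # Python 3
except ImportError:
    from urllib import quote_plus  # Python 2


def encode_form(fields, charset):
    """Hypothetical helper: URL-encode text fields in the given charset."""
    pairs = []
    for key, value in fields.items():
        # Encode to bytes first so the percent-escapes reflect the charset.
        pairs.append(quote_plus(key.encode(charset)) + '=' +
                     quote_plus(value.encode(charset)))
    return '&'.join(pairs)


name = u'Andr\xe9 T\xe9chin\xe9'
utf8_body = encode_form({u'post': name}, 'utf-8')
latin1_body = encode_form({u'post': name}, 'iso-8859-1')

print(utf8_body)
print(latin1_body)
```

The Latin-1 body could then be sent as data=latin1_body together with the application/x-www-form-urlencoded content type. Any character outside ISO-8859-1 raises UnicodeEncodeError in this sketch, so text that may contain such characters needs a fallback.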

Try decoding as UTF-8:

unicode(my_string_variable, "utf8")

Or decode and encode:

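import jinja2

# Decode the incoming byte string to unicode before rendering.
sometext = gettextfromsomewhere().decode('utf-8')
env = jinja2.Environment(loader=jinja2.PackageLoader('jinjaapplication', 'templates'))
template = env.get_template('mypage.html')
# Encode the rendered unicode back to UTF-8 bytes for output.
print template.render(sometext=sometext).encode('utf-8')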

2013-07-02 06:11:12 dikkini
