2013-04-28 11 views
1

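Searching for Chinese text throws UnicodeEncodeError

I am using python-twitter to search for tweets through the Twitter API, and I have run into a problem with Chinese text. Here is a minimal code example that reproduces the problem: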

# -*- coding: utf-8 -*- 
import twitter 

api = twitter.Api(consumer_key = "...", consumer_secret = "...", 
        access_token_key = "...", access_token_secret = "...") 

api.VerifyCredentials() 
print u"您說英語嗎" 
r = api.GetSearch(term=u"您說英語嗎") 

I get this error:

您說英語嗎 
Traceback (most recent call last): 
  File "so.py", line 9, in <module>
    r = api.GetSearch(term=u"您說英語嗎")
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/python_twitter-0.8.7-py2.7.egg/twitter.py", line 2419, in GetSearch
    json = self._FetchUrl(url, parameters=parameters)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/python_twitter-0.8.7-py2.7.egg/twitter.py", line 4041, in _FetchUrl
    url = req.to_url()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/oauth2-1.5.211-py2.7.egg/oauth2/__init__.py", line 440, in to_url
    urllib.urlencode(query, True), fragment)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 1337, in urlencode
    l.append(k + '=' + quote_plus(str(elt)))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128) 
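
The last frame of the traceback calls quote_plus(str(elt)). In Python 2, str() on a unicode object encodes it with the ASCII codec, which cannot represent Chinese characters. The sketch below reproduces that mechanism in isolation. It assumes Python 3 (the code in this thread is Python 2), so the standard-library functions come from urllib.parse, and the parameter name "q" is only illustrative. It does not show a fix inside python-twitter itself.

```python
from urllib.parse import unquote_plus, urlencode

term = "您說英語嗎"

# Python 2's str(u"...") implicitly encodes with the ASCII codec.
# Doing the same encoding explicitly reproduces the error from the traceback.
try:
    term.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = str(exc)

# Encoding to UTF-8 bytes first gives urlencode something it can percent-escape.
# "q" is an illustrative parameter name, not taken from python-twitter.
query = urlencode({"q": term.encode("utf-8")})

print(ascii_error)
print(query)
```

Decoding the percent-escaped value with unquote_plus returns the original term, which is a quick way to check that nothing was lost in the encoding step.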

Have you tried encoding the string first? For example: u"您說英語嗎".encode('utf8') – juliomalegria 2013-04-28 22:51:43

Answers

2

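There seems to be a bug in GetSearch: https://code.google.com/p/python-twitter/issues/detail?id=210. I tried searching for "Putin" in Russian ("Путин") and got the same error. Playing with encodings did not help.

As a workaround, you can use the twitter package (https://github.com/sixohsix/twitter):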

# -*- coding: utf-8 -*- 
# Twitter and OAuth come from the sixohsix "twitter" package, not from python-twitter
from twitter import Twitter, OAuth

t = Twitter(auth=OAuth(token="...", token_secret="...", consumer_key="...", consumer_secret="..."))

print t.search.tweets(q=u"您說英語嗎") 

Thanks a lot. What is "OAuth" in the code, and where do I get it from? – piokuc 2013-04-28 23:27:26

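Yes, it works fine, thanks again. I had not imported OAuth from the twitter module. By the way, it is unfortunate that two different modules share the same name... – piokuc 2013-04-28 23:37:18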

0

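Also, try this when working with non-English text:

# Python 2 only: site.py removes sys.setdefaultencoding at startup,
# so reload(sys) is needed to restore it before it can be called.
import sys
reload(sys)
sys.setdefaultencoding("utf-8")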
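
The snippet above changes the interpreter-wide default encoding, so implicit str() conversions of unicode objects use UTF-8 instead of ASCII. As a small check, assuming Python 3 rather than the Python 2 used in this thread, the sketch below reports the default encoding and whether the setdefaultencoding hook exists at all. The helper name default_encoding_report is hypothetical.

```python
import sys


def default_encoding_report():
    """Return (default encoding, whether sys.setdefaultencoding is available)."""
    return sys.getdefaultencoding(), hasattr(sys, "setdefaultencoding")


encoding, has_hook = default_encoding_report()
print(encoding, has_hook)
```

On Python 3 the default is already UTF-8 and the hook is absent, so the reload(sys) trick applies only to Python 2 code such as the examples in this thread.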