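How to decode ASCII from a stream for analysis

I want to run text from the Twitter API through sentiment analysis with the TextBlob library. When I run my code, it prints one or two sentiment values and then fails with the following error: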

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 31: ordinal not in range(128)

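I don't understand why this is something the code has to deal with if it is only analyzing text. I have tried encoding the script as UTF-8. Here is the code: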

from tweepy.streaming import StreamListener 
from tweepy import OAuthHandler 
from tweepy import Stream 
import json 
import sys 
import csv 
from textblob import TextBlob 

# Variables that contain the user credentials to access the Twitter API
access_token = "" 
access_token_secret = "" 
consumer_key = "" 
consumer_secret = "" 


# This is a basic listener that just prints received tweets to stdout. 
class StdOutListener(StreamListener):
    def on_data(self, data):
        json_load = json.loads(data)
        texts = json_load['text']
        coded = texts.encode('utf-8')
        s = str(coded)
        content = s.decode('utf-8')
        #print(s[2:-1])
        wiki = TextBlob(s[2:-1])

        r = wiki.sentiment.polarity

        print r

        return True

    def on_error(self, status):
        print(status)

auth = OAuthHandler(consumer_key, consumer_secret) 
auth.set_access_token(access_token, access_token_secret) 
stream = Stream(auth, StdOutListener()) 

# This line filters the Twitter stream to capture data by the keywords: 'dollar', 'euro'
stream.filter(track=['dollar', 'euro'], languages=['en'])

Can anyone help me with this problem?

Thank you in advance.

Source

2015-06-19 RustyShackleford

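You are mixing too many things together. As the error says, you are trying to decode a byte type.

json.loads gives you the data as a string, and from there you need to encode it.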

texts = json_load['text'] # string 
coded = texts.encode('utf-8') # byte 
print(coded[2:-1])

So in your script, when you try to decode coded, you get an error about decoding byte data.

Source
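
To make the string/bytes distinction above concrete, here is a minimal sketch that uses only the standard library. The sample payload is made up for illustration (it is not a real tweet), and the variable names are my own. It shows that text containing a character such as the pound sign encodes to bytes that include 0xc2, and that decoding those bytes as ASCII raises the same kind of error as in the traceback, while decoding them as UTF-8 round-trips cleanly.

```python
import json

# Hypothetical sample payload standing in for one message from the stream.
sample = '{"text": "1 euro is worth more than \\u00a31"}'

text = json.loads(sample)["text"]   # Unicode string, includes the pound sign
encoded = text.encode("utf-8")      # bytes; the pound sign becomes 0xc2 0xa3

# Decoding these bytes with the ASCII codec fails, because 0xc2 is
# outside the ASCII range (0-127).
try:
    encoded.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Decoding with the same codec that was used for encoding works.
round_trip = encoded.decode("utf-8")
```

In other words, once the text has been encoded, anything that treats the result as ASCII will stumble on the first non-ASCII byte, so it helps to keep track of which variables hold text and which hold bytes.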

2015-06-19 22:41:58 Leb

Thanks again, Leb... you're awesome! – RustyShackleford
