Getting an encoding error while writing data to a CSV file

from tweetpy import * 
import re 
import json 
from pprint import pprint 
import csv 

# Import the necessary methods from "twitter" library 
from twitter import Twitter, OAuth, TwitterHTTPError, TwitterStream 

# Variables that contains the user credentials to access Twitter API 
ACCESS_TOKEN = '' 
ACCESS_SECRET = '' 
CONSUMER_KEY = '' 
CONSUMER_SECRET = '' 

oauth = OAuth(ACCESS_TOKEN, ACCESS_SECRET, CONSUMER_KEY, CONSUMER_SECRET) 

# Initiate the connection to Twitter Streaming API 
twitter_stream = TwitterStream(auth=oauth) 

# Get a sample of the public data following through Twitter 
iterator = twitter_stream.statuses.filter(track="#kindle",language="en",replies="all") 
# Print each tweet in the stream to the screen 

# Here we set it to stop after getting 10000000 tweets. 
# You don't have to set it to stop, but can continue running 
# the Twitter API to collect data for days or even longer. 

tweet_count = 10000000 

file = "C:\\Users\\WELCOME\\Desktop\\twitterfeeds.csv" 
with open(file,"w") as csvfile: 
    fieldnames=['Username','Tweet','Timezone','Timestamp','Location'] 
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames) 
    writer.writeheader() 
    for tweet in iterator: 
        #pprint(tweet) 
        username = str(tweet['user']['screen_name']) 
        tweet_text = str(tweet['text']) 
        user_timezone = str(tweet['user']['time_zone']) 
        tweet_timestamp = str(tweet['created_at']) 
        user_location = str(tweet['user']['location']) 
        print tweet 
        tweet_count -= 1 
        writer.writerow({'Username':username,'Tweet':tweet_text,'Timezone':user_timezone,'Location':user_location,'Timestamp':tweet_timestamp}) 

        if tweet_count <= 0: 
            break

I want to write the tweets to a CSV file with the columns 'Username', 'Tweet', 'Timezone', 'Location', and 'Timestamp'.

But I am getting the following error:

tweet_text = str(tweet['text']) 
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026' in position 139: ordinal not in range(128).

I know it is an encoding problem, but I don't know exactly where the variable should be encoded.

Source

2017-06-04 user7699179

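What do you want to do with the offending characters? Ignore them? Convert them to the closest ASCII equivalent? Convert them to a fixed character, such as a question mark? –

The answer may differ between Python 2 and Python 3. In any case, you are not opening the csv file correctly. I suggest reading the documentation, which shows how to do it properly (in both versions). – martineau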

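Use Python 3, because the Python 2 csv module does not handle encodings well.
Use open with the encoding and newline options.
Remove the str conversions (in Python 3, str is already a Unicode string).

Result: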

with open(file,"w",encoding='utf8',newline='') as csvfile: 
    fieldnames=['Username','Tweet','Timezone','Timestamp','Location'] 
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames) 
    writer.writeheader() 
    for tweet in iterator: 
        username = tweet['user']['screen_name'] 
        tweet_text = tweet['text'] 
        user_timezone = tweet['user']['time_zone'] 
        tweet_timestamp = tweet['created_at'] 
        user_location = tweet['user']['location'] 
        . 
        . 
        .
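
For a version that can be run without Twitter credentials, here is a minimal Python 3 sketch of the same approach. The `sample_tweets` list and the temporary file name are made up for illustration and only mimic the fields used in the question; real streamed tweets carry many more keys.

```python
import csv
import os
import tempfile

# Hypothetical stand-in for the tweet stream (not real API output).
sample_tweets = [
    {
        "user": {"screen_name": "reader1", "time_zone": None, "location": "Zürich"},
        "text": "Loving my #kindle\u2026",
        "created_at": "Sun Jun 04 16:00:00 +0000 2017",
    },
]

fieldnames = ["Username", "Tweet", "Timezone", "Timestamp", "Location"]
path = os.path.join(tempfile.gettempdir(), "twitterfeeds_demo.csv")

# encoding="utf8" lets non-ASCII characters be written;
# newline="" is what the csv module expects, so it controls line endings itself.
with open(path, "w", encoding="utf8", newline="") as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    for tweet in sample_tweets:
        writer.writerow({
            "Username": tweet["user"]["screen_name"],
            "Tweet": tweet["text"],
            "Timezone": tweet["user"]["time_zone"],  # None is written as an empty field
            "Timestamp": tweet["created_at"],
            "Location": tweet["user"]["location"],
        })

# Read the file back with the same encoding to check the round trip.
with open(path, encoding="utf8", newline="") as csvfile:
    rows = list(csv.DictReader(csvfile))
```

Reading the file back with the same encoding should return the ellipsis character and the "ü" unchanged. Note that some spreadsheet programs may need to be told the file is UTF-8 when importing it.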

If you are using Python 2, get the third-party unicodecsv module to work around the shortcomings of the csv module.

Source

2017-06-04 19:03:29

If you really want to convert all of your Unicode data to ASCII:

tweet['text'].encode("ascii", "replace")  # replace each non-ASCII character with '?'
# or
tweet['text'].encode("ascii", "ignore")  # skip non-ASCII characters entirely
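
To make the difference between the two error handlers concrete, here is a small Python 3 sketch on a made-up string. The third variant, which uses `unicodedata.normalize` to approximate characters with their closest ASCII equivalents, is not part of this answer; it is added only as one possible way to address the "closest equivalent" option raised in the comments.

```python
import unicodedata

# Hypothetical tweet text containing an ellipsis (U+2026) and an accented "e".
text = "Reading on my #kindle\u2026 caf\u00e9"

# "replace": every character that cannot be encoded becomes "?".
replaced = text.encode("ascii", "replace").decode("ascii")

# "ignore": characters that cannot be encoded are dropped.
ignored = text.encode("ascii", "ignore").decode("ascii")

# NFKD normalization first splits characters into compatibility forms
# (U+2026 becomes three periods, "é" becomes "e" plus a combining accent),
# then "ignore" drops whatever still is not ASCII.
approximated = (
    unicodedata.normalize("NFKD", text)
    .encode("ascii", "ignore")
    .decode("ascii")
)

print(replaced)      # Reading on my #kindle? caf?
print(ignored)       # Reading on my #kindle caf
print(approximated)  # Reading on my #kindle... cafe
```

All three variants lose information, so they are mainly useful when the output really must be ASCII; otherwise writing the file as UTF-8 keeps the original text intact.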

Source

2017-06-04 16:27:14 Ptank
