UnicodeEncodeError: 'ascii' codec can't encode character u'\u2730' in position 1: ordinal not in range(128)

Any ideas how to fix this? The script below fails with UnicodeEncodeError: 'ascii' codec can't encode character u'\u2730' in position 1: ordinal not in range(128)

import csv 
import re 
import time 
import urllib2 
from urlparse import urljoin 
from bs4 import BeautifulSoup 

BASE_URL = 'http://omaha.craigslist.org/sys/' 
URL = 'http://omaha.craigslist.org/sya/' 
FILENAME = '/Users/mona/python/craigstvs.txt' 

opener = urllib2.build_opener() 
opener.addheaders = [('User-agent', 'Mozilla/5.0')] 
soup = BeautifulSoup(opener.open(URL)) 

with open(FILENAME, 'a') as f: 
    writer = csv.writer(f, delimiter=';') 
    for link in soup.find_all('a', class_=re.compile("hdrlnk")): 
        timeset = time.strftime("%m-%d %H:%M")

        item_url = urljoin(BASE_URL, link['href'])
        item_soup = BeautifulSoup(opener.open(item_url))

        # do something with the item_soup? or why did you need to follow this link?

        writer.writerow([timeset, link.text, item_url])

Source

2014-09-06 Mona Jalal

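From experience, I have to say that the csv module does not fully support Unicode (in Python 2 it expects byte strings), but you may find it useful to open the file this way:

import codecs
...
# 'a' appends, as in the question; use 'r' to read instead
f = codecs.open('file.csv', 'a', 'UTF-8')

Alternatively, you may want to write the rows yourself instead of using the csv module.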
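As a rough illustration of the "write the rows yourself" option, here is a minimal sketch that writes one semicolon-separated line to a file opened with an explicit UTF-8 encoding. The helper `write_row`, the file name `example_rows.txt`, and the sample values are all hypothetical and not part of the original script. It uses `io.open`, which behaves the same way on Python 2.6+ and Python 3, in place of `codecs.open`.

```python
# -*- coding: utf-8 -*-
import io


def write_row(f, fields, delimiter=u';'):
    """Write one delimiter-separated line of text fields to a text-mode file.

    Hypothetical helper: it does no quoting or escaping, so the fields
    must not contain the delimiter or newline characters.
    """
    f.write(delimiter.join(fields) + u'\n')


# The file object encodes text to UTF-8 on write, so unicode strings
# can be passed straight through without calling .encode() by hand.
with io.open('example_rows.txt', 'a', encoding='utf-8') as f:
    write_row(f, [u'09-06 10:18', u'\u2730 Dell server', u'http://example.com/item'])
```

The trade-off is that the quoting rules the csv module normally handles are lost, so this only suits data that is known not to contain the delimiter.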

Source

2014-09-06 10:18:19 mehdy

You just need to encode the text:

link.text.encode("utf-8")
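To see why this helps, the following small sketch reproduces the failure outside the scraper. In Python 2, csv.writer converts unicode fields to byte strings with the default ASCII codec, which cannot represent U+2730; encoding to UTF-8 yourself avoids that implicit step. The sample string `title` is an assumption standing in for `link.text`.

```python
# -*- coding: utf-8 -*-
# Hypothetical link text containing the non-ASCII character from the traceback.
title = u'\u2730 Dell server'

# Encoding with the ASCII codec fails, which is what the error reports.
try:
    title.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding as UTF-8 succeeds: U+2730 becomes the three bytes E2 9C B0.
encoded = title.encode('utf-8')
```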

You can also use requests instead of urllib2:

import csv
import re
import time
from urlparse import urljoin

import requests
from bs4 import BeautifulSoup

BASE_URL = 'http://omaha.craigslist.org/sys/'
URL = 'http://omaha.craigslist.org/sya/'
FILENAME = 'craigstvs.txt'

soup = BeautifulSoup(requests.get(URL).content)
with open(FILENAME, 'a') as f:
    writer = csv.writer(f, delimiter=';')
    for link in soup.find_all('a', class_=re.compile("hdrlnk")):
        timeset = time.strftime("%m-%d %H:%M")
        item_url = urljoin(BASE_URL, link['href'])
        item_soup = BeautifulSoup(requests.get(item_url).content)
        # do something with the item_soup? or why did you need to follow this link?
        # encode the unicode link text to UTF-8 bytes before handing it to csv
        writer.writerow([timeset, link.text.encode("utf-8"), item_url])
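The code above targets Python 2. If the script were ported to Python 3 (an assumption, since the question uses urllib2 and urlparse), the csv module works on text directly, and the manual `.encode("utf-8")` call would be replaced by passing an encoding when opening the file. A minimal sketch, with a hypothetical file name and sample row:

```python
import csv

# Hypothetical sample row standing in for [timeset, link.text, item_url].
rows = [['09-06 10:30', '\u2730 Dell server', 'http://example.com/item']]

# newline='' lets the csv module control line endings itself;
# encoding='utf-8' makes the file object encode the text on write.
with open('example_py3.csv', 'w', encoding='utf-8', newline='') as f:
    writer = csv.writer(f, delimiter=';')
    writer.writerows(rows)
```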

Source

2014-09-06 10:30:13
