2017-06-17 40 views
-1

So I'm trying to scrape data about motherboards from a local website. How do I fix UnicodeEncodeError?

import bs4 
import os 
import requests 

from bs4 import BeautifulSoup as soup 

os.chdir('E://') 
os.makedirs('E://scrappy', exist_ok=True) 
myurl = "https://www.example.com" 
res = requests.get(myurl) 
page = soup(res.content, 'html.parser') 
containers = page.findAll("div", {"class": "content-product"}) 
filename = 'AM4.csv' 
f = open(filename, 'w') 
headers = 'Motherboard_Name, Price\n' 
f.write(headers) 

for container in containers: 
    Product = container.findAll("div", {"class": "product-title"}) 
    Motherboard_Name = Product[0].text.strip() 
    Kimat = container.findAll("span", {"class": "price"}) 
    Price = Kimat[0].text 
    print('Motherboard_Name' + Motherboard_Name) 
    print('Price' + Price) 
    f.write(Motherboard_Name + "," + Price.replace(",", "") + "\n") 
f.close()
print("done")

But when I run this code I get an error:

UnicodeEncodeError: 'charmap' codec can't encode character '\u20b9' in position 45: character maps to <undefined>

How can I fix this?

Edit: I added encoding="utf-8" (as mentioned here: python 3.2 UnicodeEncodeError: 'charmap' codec can't encode character '\u2013' in position 9629: character maps to <undefined>), i.e. open(filename, 'w', encoding="utf-8"), and that fixed the Unicode error. However, in the CSV file I now get characters like (¹) before the price. How can I fix this?

screenshot of the csv file

+0

What if you add this at the start of your script: #!/usr/bin/env python # -*- coding: utf-8 -*- – Costis94

+0

Which line do you get it on? – Jeril

+0

@Costis94 "line 32": File "E:\scrappy\motherboard.py", line 32, in <module> f.write(Motherboard_Name + "," + Price.replace(",", "") + "\n") – user2996348

Answers

0

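Use the csv module to manage CSV files, and use the utf-8-sig encoding so that Excel recognizes the file as UTF-8. When opening the file, make sure to pass newline='' as the csv documentation recommends.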
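To see why plain utf-8 leaves stray characters in front of the price, the following sketch may help. It assumes the Windows default code page is cp1252 (not stated in the question, but consistent with the 'charmap' error and the ¹ seen in the CSV) and only inspects bytes, so it touches no files.

```python
# The rupee sign that triggered the original error.
rupee = '\u20b9'

# cp1252 (assumed default on the asker's Windows machine) has no rupee sign,
# which is what the original 'charmap' error is reporting.
try:
    rupee.encode('cp1252')
    cp1252_can_encode = True
except UnicodeEncodeError:
    cp1252_can_encode = False

# Plain UTF-8 encodes the sign as three bytes.
utf8_bytes = rupee.encode('utf-8')

# A program that reads those bytes as cp1252 shows three unrelated
# characters, the last of which is the superscript one.
mojibake = utf8_bytes.decode('cp1252')

# utf-8-sig emits the same bytes preceded by a byte order mark (BOM),
# which is the hint Excel uses to treat the file as UTF-8.
with_bom = rupee.encode('utf-8-sig')

print(cp1252_can_encode)
print(utf8_bytes)
print(ascii(mojibake))
print(with_bom)
```

In other words, encoding="utf-8" writes correct bytes; the stray characters come from the reader guessing the wrong encoding, and the BOM removes the guesswork.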

Example:

import csv 

filename = 'AM4.csv' 
with open(filename,'w',newline='',encoding='utf-8-sig') as f: 
    w = csv.writer(f) 
    w.writerow(['Motherboard_Name','Price']) 
    name = 'some name' 
    price = '\u20b95,99' 
    w.writerow([name,price.replace(',','')]) 

Excel image
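As a rough sketch of how this could be folded into the scraping loop from the question: the helper name write_prices and the sample rows below are hypothetical, standing in for the (Motherboard_Name, Price) pairs collected from the page.

```python
import csv


def write_prices(filename, rows):
    """Write (name, price) pairs to a CSV file that Excel opens as UTF-8."""
    # newline='' lets the csv module control line endings itself;
    # utf-8-sig prepends the BOM that Excel looks for.
    with open(filename, 'w', newline='', encoding='utf-8-sig') as f:
        w = csv.writer(f)
        w.writerow(['Motherboard_Name', 'Price'])
        for name, price in rows:
            # Drop the thousands separator, as the question's code does.
            w.writerow([name.strip(), price.replace(',', '')])


# Hypothetical sample data in place of the scraped containers.
sample_rows = [
    ('Board A', '\u20b95,999'),
    ('Board B, rev 2', '\u20b912,500'),
]
write_prices('AM4.csv', sample_rows)
```

In the original loop, each iteration would append (Motherboard_Name, Price) to a list that is passed to the helper once scraping finishes. A side benefit of csv.writer over manual string concatenation is that a product name containing a comma is quoted instead of silently splitting into extra columns.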

+0

Thank you very much, it worked. – user2996348