BeautifulSoup UnicodeEncodeError

I am trying to parse an HTML page that I saved on my computer (Windows 10).

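from bs4 import BeautifulSoup

# Parse the saved page, reading the file as UTF-8.
with open("res/JLPT N5 vocab list.html", "r", encoding="utf8") as f:
    soup = BeautifulSoup(f, "html.parser")

tables = soup.find_all("table")
sectable = tables[1]  # second table on the page
for tr in sectable.contents[1:]:
    if tr.name == "tr":
        try:
            # Text of the link in the row's first cell
            print(tr.td.a.get_text())
        except AttributeError:
            # Row has no <td> or no <a>; skip it
            continue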

It should print the first column, which contains all the Japanese words, but print(tr.td.a.get_text()) raises UnicodeEncodeError: 'charmap' codec can't encode character in position 0-1: character maps to <undefined>. How can I solve this error?

Source

2016-03-16 witoong623

In the end, I solved this problem by following the Miscellaneous section of the Beautiful Soup documentation:

UnicodeEncodeError: 'charmap' codec can't encode character u'\xfoo' in position bar (or just about any other UnicodeEncodeError) - This is not a problem with Beautiful Soup. This problem shows up in two main situations. First, when you try to print a Unicode character that your console doesn’t know how to display. (See this page on the Python wiki for help.) Second, when you’re writing to a file and you pass in a Unicode character that’s not supported by your default encoding. In this case, the simplest solution is to explicitly encode the Unicode string into UTF-8 with u.encode("utf8").
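The quoted explanation can be reproduced without a console at all. The following is a minimal sketch, assuming the console's code page is cp1252 (an assumption; the question does not say which code page was active) and using an illustrative sample word rather than data from the actual HTML file.

```python
# Illustrative sample; the words in the real HTML file are not shown in the question.
word = "日本語"

# Assumption: the console uses cp1252, one of Python's "charmap" codecs.
# Encoding a character that the code page does not contain raises the error.
try:
    word.encode("cp1252")
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    print("cp1252 cannot encode it:", exc.reason)

# UTF-8 can represent every Unicode character, so this succeeds.
utf8_bytes = word.encode("utf8")
print(len(utf8_bytes), "bytes in UTF-8")
```

print() performs the same encoding step internally, using the encoding of sys.stdout, which is why the error appears only when the text is printed and not when it is parsed.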

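In my case, the cause was that I was trying to print a Unicode character that my console did not know how to display.
So I enabled a TrueType font for the console, changed the system locale to Japanese (which changes the console encoding and makes console fonts that support Japanese selectable), and then changed the console font to MS ゴシック (MS Gothic; this font only became available after I changed the system locale).
If I want to write the output to a file instead, I just open the file and specify UTF-8 as the encoding.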
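Writing to a file with an explicit encoding might look like the sketch below. The sample words and the output path are hypothetical stand-ins, not taken from the original page; a temporary directory is used only to keep the example self-contained.

```python
import os
import tempfile

# Hypothetical sample data standing in for the words extracted from the table.
words = ["日本語", "学校", "先生"]

# Hypothetical output path inside a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "words.txt")

# Passing encoding explicitly avoids falling back to the platform default,
# which on Windows is often a legacy code page.
with open(out_path, "w", encoding="utf-8") as out:
    for word in words:
        out.write(word + "\n")

# Read the file back with the same encoding to confirm the round trip.
with open(out_path, "r", encoding="utf-8") as f:
    read_back = f.read().splitlines()
```

Opening the result in an editor that understands UTF-8 then shows the Japanese text regardless of the console's code page.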

Source

2016-03-16 08:37:24 witoong623
