Python: reading UTF-8 from raw_input() and writing UTF-8 to a file

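So, I want to write a program that does two things:

Read a word
Read its translation in Greek

Then I build a new string in the format "word,translation" and write it to a file.

So the test.txt file should contain "Hello,Γεια", and if I run the program again, the next line should go below this one.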

word=raw_input("Word:\n") #The Word 
translation=raw_input("Translation:\n").decode("utf-8") #The Translation in UTF-8 
format=word+","+translation+"\n" 
file=open("dict.txt","w") 
file.write(format.encode("utf-8")) 
file.close()

The error I get:

UnicodeDecodeError: 'utf8' codec can't decode byte 0x82 in position 0: invalid start byte

Edit: this is Python 2

2017-04-21 Phill

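Add the full error message so we can see which line has the problem. Are you on Windows? What is 'sys.getdefaultencoding()'? Considering that poor Unicode support is one of the reasons Python 3 was invented, why in the world are you doing this in crappy old Python 2?! – tdelaney

You should move to Python 3, which has much better Unicode support. – ForceBru

Hmm, what is 'sys.stdin.encoding'? You might want to do 'raw_input(..).decode(sys.stdin.encoding)' to make it work. – tdelaney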

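Although Python 2 supports unicode, its input is not automatically decoded to unicode for you. raw_input returns a byte string, and if you enter anything other than ASCII you get the encoded bytes. The trick is to figure out what that encoding is, and that depends on how the data gets into the program. If it comes from a terminal, sys.stdin.encoding should tell you which encoding to use. If it is piped in from a file, sys.stdin.encoding is None and you just have to know what the encoding is.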
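To make the failure concrete, here is a minimal sketch of what may have happened. It assumes the console was using the Greek code page cp737; the question never says so, but in cp737 the letter Γ is the single byte 0x82, which matches the byte in the error message. The variable names are made up for illustration.

```python
# Assumption: the terminal delivers text encoded as cp737 (Greek DOS code page).
greek = u"\u0393\u03b5\u03b9\u03b1"   # the word from the question, written with escapes
raw = greek.encode("cp737")           # the bytes raw_input would hand back in that case

error = None
try:
    # Decoding with the wrong codec: 0x82 cannot start a UTF-8 sequence.
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    error = exc

# Decoding with the codec the bytes were actually written in succeeds.
text = raw.decode("cp737")
```

The bytes themselves are fine; only the codec passed to decode() has to match whatever produced them.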

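A solution to your problem follows. Note that even though your method of writing the file (encode, then write) works, the codecs module provides a file object that does the encoding for you.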

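import sys
import codecs

# sys.stdin.encoding is None when input is piped in, so fall back to a
# guess; a command line param may be useful if you read input from files
_stdin_encoding = sys.stdin.encoding or 'utf-8'

def unicode_input(prompt):
    # raw_input returns encoded bytes in Python 2; decode them to unicode
    return raw_input(prompt).decode(_stdin_encoding)

word = unicode_input("Word:\n")  # the word
translation = unicode_input("Translation:\n")  # the translation
line = word + "," + translation + "\n"
# give codecs.open an encoding so it encodes on write; "a" appends a new line
with codecs.open("dict.txt", "a", encoding="utf-8") as myfile:
    myfile.write(line)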
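The write step can also be checked without a terminal. The sketch below is an illustration only, not part of the original answer: it swaps in io.open, which takes an encoding argument much like codecs.open and behaves the same on Python 2.6+ and Python 3. The helper names append_entry and read_entries and the temporary directory are hypothetical.

```python
import io
import os
import tempfile

def append_entry(path, word, translation):
    """Append one "word,translation" line to path, encoded as UTF-8."""
    # Mode "a" keeps earlier entries, so each call adds a line below the last.
    with io.open(path, "a", encoding="utf-8") as handle:
        handle.write(u"%s,%s\n" % (word, translation))

def read_entries(path):
    """Return the lines of path as unicode strings, without newlines."""
    with io.open(path, "r", encoding="utf-8") as handle:
        return handle.read().splitlines()

# Hypothetical location: a fresh temporary directory, so nothing is overwritten.
path = os.path.join(tempfile.mkdtemp(), "dict.txt")
append_entry(path, u"Hello", u"\u0393\u03b5\u03b9\u03b1")
append_entry(path, u"Yes", u"\u039d\u03b1\u03b9")
entries = read_entries(path)
```

Opening the file with mode "w" instead would truncate it on every run, so only the most recent entry would survive.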

2017-04-21 17:25:50 tdelaney
