UnicodeDecodeError in Python: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)

I have the following code:

# -*- coding: utf-8 -*- 

forbiddenWords=['for', 'and', 'nor', 'but', 'or', 'yet', 'so', 'not', 'a', 'the', 'an', 'of', 'in', 'to', 'for', 'with', 'on', 'at', 'from', 'by', 'about', 'as'] 


def IntoSentences(paragraph): 
    paragraph = paragraph.replace("–", "-") 
    import nltk.data 
    sent_detector = nltk.data.load('tokenizers/punkt/english.pickle') 
    sentenceList = sent_detector.tokenize(paragraph.strip()) 
    return sentenceList 

from Tkinter import * 

root = Tk() 

var = StringVar() 
label = Label(root, textvariable=var) 
var.set("Fill in the caps: ") 
label.pack() 

text = Text(root) 
text.pack() 

button=Button(root, text ="Create text with caps.", command = IntoSentences(text.get(1.0,END))) 
button.pack() 

root.mainloop()

When I try to run this code, I get the following error:

C:\Users\Indrek>C:\Python27\Myprojects\caps_main.py 
Traceback (most recent call last): 
  File "C:\Python27\Myprojects\caps_main.py", line 25, in <module> 
    button=Button(root, text ="Create text with caps.", command = IntoSentences(text.get(1.0,END))) 
  File "C:\Python27\Myprojects\caps_main.py", line 7, in IntoSentences 
    paragraph = paragraph.replace("ŌĆō", "-") 
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)

What is wrong here? I did some research on this problem, but the posts I read did not help me. What should I change in this particular code?
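
For readers who hit the same message: byte 0xe2 is the first byte of the UTF-8 encoding of the en dash (E2 80 93). The traceback is consistent with Python 2 implicitly decoding the byte string "–" as ASCII because the text being searched is a unicode object. That reading is an assumption, since the thread does not confirm it. A minimal sketch that reproduces the same message by performing the decode explicitly (it should run on both Python 2 and Python 3):

```python
# -*- coding: utf-8 -*-
# Sketch: reproduce the error message by decoding the dash's UTF-8 bytes as ASCII.
# Assumption: this explicit decode mirrors what Python 2 does implicitly when a
# non-ASCII byte string is combined with a unicode string.

EN_DASH_UTF8 = b"\xe2\x80\x93"  # UTF-8 bytes of U+2013 EN DASH

try:
    EN_DASH_UTF8.decode("ascii")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)

# Decoding with the right codec works, and replacing on text is then safe.
en_dash = EN_DASH_UTF8.decode("utf-8")
fixed = (u"1990" + en_dash + u"2000").replace(en_dash, u"-")
print(fixed)
```

In Python 2 source, writing the literal as u"–" (with the coding line in place) keeps both sides of replace as unicode, which avoids the implicit ASCII decode.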


2014-06-13 user244902

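Are you sure this is the correct error? I can't reproduce it, nor can I find any way to mangle the encoding enough to simulate it. For that matter, are you sure the file is saved as UTF-8? – Veedrac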

Yes, I checked, and this is the correct error. I'm fairly sure the file is saved as UTF-8. What can I do to be absolutely sure? – user244902

Run 'print(repr(open(filename, "rb").read()))' and give us the output (preferably cropped). – Veedrac
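
A slightly fuller version of that check, as a sketch: read the file as raw bytes, try to decode it as UTF-8, and show the bytes around the first non-ASCII byte. The helper name inspect_encoding and the temporary sample file are illustrative assumptions, not part of the original thread.

```python
import io
import os
import tempfile


def inspect_encoding(path, context=8):
    """Return (is_utf8, snippet) for the file at path.

    snippet is the repr of the raw bytes around the first non-ASCII byte,
    or None if the file is pure ASCII.
    """
    with io.open(path, "rb") as handle:
        raw = handle.read()
    try:
        raw.decode("utf-8")
        is_utf8 = True
    except UnicodeDecodeError:
        is_utf8 = False
    # bytearray yields integers on both Python 2 and Python 3.
    for index, byte in enumerate(bytearray(raw)):
        if byte >= 0x80:
            start = max(0, index - context)
            return is_utf8, repr(raw[start:index + context])
    return is_utf8, None


# Hypothetical sample file standing in for caps_main.py.
sample = u'paragraph.replace("\u2013", "-")\n'.encode("utf-8")
fd, sample_path = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "wb") as handle:
    handle.write(sample)

is_utf8, snippet = inspect_encoding(sample_path)
print(is_utf8, snippet)
os.remove(sample_path)
```

If the file really is UTF-8, the en dash should show up in the snippet as the three bytes \xe2\x80\x93. A single byte such as \x96 would instead suggest a Windows code page.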

My mistake was using command the wrong way, just as Bryan Oakley said. My code now looks like this, and everything works:

# -*- coding: utf-8 -*- 

forbiddenWords=['for', 'and', 'nor', 'but', 'or', 'yet', 'so', 'not', 'a', 'the', 'an', 'of', 'in', 'to', 'for', 'with', 'on', 'at', 'from', 'by', 'about', 'as'] 


def IntoSentences(paragraph): 
    paragraph = paragraph.replace("–", "-") 
    import nltk.data 
    sent_detector = nltk.data.load('tokenizers/punkt/english.pickle') 
    sentenceList = sent_detector.tokenize(paragraph.strip()) 
    return sentenceList 

def new_sentences(sentenceList): 
    # Blank out one randomly chosen non-forbidden word in each sentence.
    import re 
    from random import choice 
    result = [] 
    for lause in sentenceList: 
        wordList = re.sub(r"[^\w]", " ", lause).split() 
        candidates = [w for w in wordList if w.lower() not in forbiddenWords] 
        if not candidates: 
            # Nothing suitable to blank out; keep the sentence unchanged.
            result.append(lause) 
            continue 
        asendatav_s6na = choice(candidates) 
        uus_lause = lause.replace(asendatav_s6na, "______") 
        result.append(uus_lause) 
    return result 

from Tkinter import * 

root = Tk() 

var = StringVar() 
label = Label(root, textvariable=var) 
var.set("Fill in the caps: ") 
label.pack() 

text = Text(root) 
text.pack() 

button=Button(root, text ="Create text with caps.", command =lambda: IntoSentences(text.get(1.0,END))) 
button.pack() 

root.mainloop()

What I changed was adding lambda: to the command option: button=Button(root, text ="Create text with caps.", command =lambda: IntoSentences(text.get(1.0,END)))
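
The difference that lambda: makes can be shown without a GUI. Written as command = IntoSentences(...), the function is called while the Button is being constructed, and its return value is passed as the callback. With lambda:, the call is deferred until the button is clicked. The sketch below uses a stand-in FakeButton class and a simplified into_sentences function (both assumptions for illustration, not Tkinter's Button or the NLTK tokenizer) so that it runs without a display:

```python
calls = []


def into_sentences(paragraph):
    """Stand-in for IntoSentences: record the call and split on periods."""
    calls.append(paragraph)
    return [part.strip() for part in paragraph.split(".") if part.strip()]


class FakeButton(object):
    """Hypothetical stand-in for a Tkinter Button: it only stores a callback."""

    def __init__(self, command=None):
        self.command = command

    def click(self):
        # Mirrors invoking the stored callback when the widget is clicked.
        if callable(self.command):
            return self.command()
        return None


# Called immediately: the function runs now, and the button stores a list.
eager = FakeButton(command=into_sentences("One. Two."))
calls_after_eager = len(calls)

# Deferred: nothing runs until the click.
lazy = FakeButton(command=lambda: into_sentences("Three. Four."))
calls_after_lazy = len(calls)
result = lazy.click()

print(calls_after_eager, calls_after_lazy, result)
```

Deferring the call also defers any exception raised inside the function, so an encoding problem in IntoSentences, if one remains, would surface on the click rather than at startup.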


2014-06-13 11:41:57 user244902
