UnicodeDecodeError in the TextBlob tutorial
I am trying to run through the TextBlob tutorial on Windows (using the Git Bash shell) with Python 3.3.
I have installed textblob and nltk, along with all of their dependencies.
The Python code is:
from text.blob import TextBlob
wiki = TextBlob("Python is a high-level, general-purpose programming language.")
tags = wiki.tags
I get the following error:
Traceback (most recent call last):
File "textblob.py", line 4, in <module>
tags = wiki.tags
File "c:\Python33\lib\site-packages\text\decorators.py", line 18, in __get__
value = obj.__dict__[self.func.__name__] = self.func(obj)
File "c:\Python33\lib\site-packages\text\blob.py", line 357, in pos_tags
for word, t in self.pos_tagger.tag(self.raw)
File "c:\Python33\lib\site-packages\text\taggers.py", line 40, in tag
return pattern_tag(sentence, tokenize)
File "c:\Python33\lib\site-packages\text\en.py", line 115, in tag
for sentence in parse(s, tokenize, True, False, False, False, encoding).split():
File "c:\Python33\lib\site-packages\text\en.py", line 99, in parse
return parser.parse(unicode(s), *args, **kwargs)
File "c:\Python33\lib\site-packages\text\text.py", line 1213, in parse
s[i] = self.find_tags(s[i], **kwargs)
File "c:\Python33\lib\site-packages\text\en.py", line 49, in find_tags
return _Parser.find_tags(self, tokens, **kwargs)
File "c:\Python33\lib\site-packages\text\text.py", line 1161, in find_tags
map = kwargs.get( "map", None))
File "c:\Python33\lib\site-packages\text\text.py", line 967, in find_tags
tagged.append([token, lexicon.get(token, i==0 and lexicon.get(token.lower()) or None)])
File "c:\Python33\lib\site-packages\text\text.py", line 98, in get
return self._lazy("get", *args)
File "c:\Python33\lib\site-packages\text\text.py", line 79, in _lazy
self.load()
File "c:\Python33\lib\site-packages\text\text.py", line 367, in load
dict.update(self, (x.split(" ")[:2] for x in _read(self._path) if x.strip()))
File "c:\Python33\lib\site-packages\text\text.py", line 367, in <genexpr>
dict.update(self, (x.split(" ")[:2] for x in _read(self._path) if x.strip()))
File "c:\Python33\lib\site-packages\text\text.py", line 346, in _read
for line in f:
File "c:\Python33\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 16: character maps to <undefined>
Any idea what is going wrong here? Adding a 'u' prefix to the string did not help.
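I quickly ran through the tutorial and it works fine on my OS X machine with Python 3.3. Could you have an older version of TextBlob? It looks like a similar issue was just fixed and released: https://github.com/sloria/TextBlob/issues/15 –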
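No luck. I am using 0.6.3, which I believe is the latest. I ran pip with --force-reinstall and saw a libyaml error while pyyaml was installing. The installation did continue, so I am not sure whether that is a serious problem. – sgoldber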
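To follow up on this, I went through a short tutorial on the front page of the [nltk website](http://nltk.org/) and ran into a very similar error. Cloning the main repo from GitHub fixed it. Maybe I need to try something similar with textblob. – sgoldber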
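The last frames of the traceback (`for line in f` followed by `cp1252.py`) suggest that a data file is being opened without an explicit encoding, so Python falls back to the Windows locale default, cp1252. The sketch below reproduces that kind of failure in isolation. It is not TextBlob's code: the file contents and the `read_lines` helper are made up for illustration, and it assumes the real lexicon file is UTF-8 encoded, which the traceback does not confirm.

```python
import os
import tempfile

# Hypothetical sample line, not the real TextBlob lexicon.
# U+201D (right double quotation mark) is E2 80 9D in UTF-8,
# and byte 0x9D has no mapping in Python's cp1252 codec.
sample = "\u201cquoted\u201d word NN\n"

# Write the sample as UTF-8 bytes to a temporary file.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as f:
    f.write(sample.encode("utf-8"))
    path = f.name


def read_lines(path, encoding):
    """Read a text file with an explicit encoding and return its lines."""
    with open(path, encoding=encoding) as f:
        return [line for line in f]


# Decoding as cp1252 (the usual default on Western Windows locales) fails.
try:
    read_lines(path, "cp1252")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

# Decoding the same bytes as UTF-8 succeeds.
utf8_lines = read_lines(path, "utf-8")
os.remove(path)

print(error_message)
print(utf8_lines)
```

If this is indeed the cause, it would also explain why the tutorial works on OS X, where the default text encoding is normally UTF-8, and why a newer release or a fresh clone that passes an explicit encoding could make the error go away.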