Error: 'file' object has no attribute 'lower'

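I'm playing with a stop-word filter and feeding the script the path to a file that contains the article. However, I get this error: 'file' object has no attribute 'lower'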

Traceback (most recent call last): 
File "stop2.py", line 17, in <module> 
print preprocess(sentence) 
File "stop2.py", line 10, in preprocess 
sentence = sentence.lower() 
AttributeError: 'file' object has no attribute 'lower'

My code is attached below. Any ideas on how to pass a file as the argument?

# -*- coding: utf-8 -*- 
from __future__ import division, unicode_literals 
import string 
import nltk 
from nltk.tokenize import RegexpTokenizer 
from nltk.corpus import stopwords 
import re 

def preprocess(sentence): 
    sentence = sentence.lower() 
    tokenizer = RegexpTokenizer(r'\w') 
    tokens = tokenizer.tokenize(sentence) 
    filtered_words = [w for w in tokens if not w in stopwords.words('english')] 
    return " ".join(filtered_words) 

sentence = open('pathtofile') 
print preprocess(sentence)

Source

2016-11-24 Silas

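sentence = open(...) means that sentence is a file instance (the object returned by open()), whereas you seem to want it to hold the entire contents of the file: sentence = open(...).read()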
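A minimal sketch of that fix, under a few assumptions: `read_text` is a hypothetical helper name, the file is assumed to be UTF-8, and the small `STOPWORDS` set stands in for `stopwords.words('english')` so the example runs without NLTK data. It also matches `\w+` rather than the question's `\w`, since `\w` alone matches one character at a time.

```python
# -*- coding: utf-8 -*-
import io
import re

# Hypothetical stand-in for stopwords.words('english'); not the NLTK list.
STOPWORDS = {"the", "a", "is", "of", "and"}

def read_text(path):
    # io.open decodes the file to text on both Python 2 and Python 3.
    # The UTF-8 encoding is an assumption about the input file.
    with io.open(path, encoding="utf-8") as handle:
        return handle.read()  # a string, which does have .lower()

def preprocess(text):
    # Lowercase first, then pull out whole words.
    tokens = re.findall(r"\w+", text.lower(), flags=re.UNICODE)
    return " ".join(w for w in tokens if w not in STOPWORDS)
```

With these in place, the last two lines of the script would become something like `print(preprocess(read_text('pathtofile')))`. The key point is that `preprocess` receives the text itself, not the file object.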

Source

2016-11-24 21:49:52 badnews

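I understand, but this traceback error still persists..... Traceback (most recent call last): File "stop2.py", line 17, in <module> print preprocess(file) File "stop2.py", line 14, in preprocess return u" ".join(f.decode('utf-8') for f in filtered_words) File "stop2.py", line 14, in <genexpr> return u" ".join(f.decode('utf-8') for f in filtered_words) File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode return codecs.utf_8_decode(input, errors, True) UnicodeDecodeError: 'utf8' codec can't decode byte 0xe2 in position 0: unexpected end of data – Silas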
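One plausible reading of that error, offered as an assumption because the revised script is not shown: the tokens are raw byte strings, and a multi-byte UTF-8 character has been split across tokens, so a token holding only the lead byte 0xe2 cannot be decoded on its own. The sketch below reproduces just that behaviour; `decodes_cleanly` is a hypothetical helper and the sample text is made up.

```python
# -*- coding: utf-8 -*-
# "it’s" with a curly apostrophe, whose UTF-8 form is three bytes: e2 80 99.
raw = u"it\u2019s".encode("utf-8")

def decodes_cleanly(data):
    # True if the byte string is complete, valid UTF-8.
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

whole = decodes_cleanly(raw)          # the full byte sequence
fragment = decodes_cleanly(raw[2:3])  # only the lead byte 0xe2
```

If this is indeed the cause, decoding the whole file once before tokenizing (for example by opening it with `io.open(path, encoding='utf-8')`) would avoid calling `decode` on individual tokens.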
