Python read/write

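My program has to read a text file that has many lines. It then copies the same text to an output file, except that all the useless words (such as "the", "a", and "an") are removed. What is the problem?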

2013-12-08 Chingy

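'a.txt' will contain the original lines plus the appended ones, since you aren't clearing the file. Not sure whether that matters. Also, could you tell us what the **symptoms** of the problem are, i.e. what happens as opposed to what you want to happen? –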

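You have a list of all the lines in the file. You iterate over that list and check whether each line is in stopList, which contains the three words 'the', 'a', 'an'. Is anything wrong here? – aste123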

Here you go, just use str.replace:

with open("a.txt", "r") as fin, open("b.txt", "w") as fout:
    stopList = ['the', 'a', 'an']
    for line in fin:
        for useless in stopList:
            line = line.replace(useless + ' ', '')
        fout.write(line)
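One caveat with the str.replace loop above: it matches substrings, not whole words, so "a " is also removed from the end of words such as "banana". The sketch below illustrates this and compares it with a word-by-word filter. The function names `strip_with_replace` and `strip_by_word` are made up for this illustration, and the word-based variant assumes words are separated by single spaces.

```python
stop_list = ['the', 'a', 'an']


def strip_with_replace(line):
    # Same technique as the answer: remove "<stop word> " wherever it occurs,
    # including inside or at the end of longer words.
    for useless in stop_list:
        line = line.replace(useless + ' ', '')
    return line


def strip_by_word(line):
    # Alternative sketch: compare whole space-separated words instead.
    words = line.split(' ')
    return ' '.join(word for word in words if word not in stop_list)


print(strip_with_replace('a banana split'))  # banansplit
print(strip_by_word('a banana split'))       # banana split
```

Neither variant handles capitalised stop words or a stop word directly followed by punctuation or a newline; those cases would need extra handling.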

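If you don't want to hold the whole file in memory, you need to write the result somewhere else. But if you don't mind, you can rewrite the file in place: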

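stopList = ['the', 'a', 'an']
# Read everything first: opening the same file with "w" truncates it
# immediately, so both handles must not be open at the same time.
with open("a.txt", "r") as fin:
    r = []
    for line in fin:
        for useless in stopList:
            line = line.replace(useless + ' ', '')
        r.append(line)
# Reopen for writing only after all lines are in memory.
with open("a.txt", "w") as fout:
    fout.writelines(r)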

Demo:

>>> line = 'the a, the b, the c' 
>>> stopList=['the','a','an'] 
>>> for useless in stopList: 
    line = line.replace(useless+' ', '') 


>>> line 
'a, b, c'


2013-12-08 14:39:51 aIKid

@alKid It copies each element three times – Chingy

What do you mean by "three times an element"? – aIKid

@alKid For example, it writes "ABC" three times, like "ABC ABC ABC" – Chingy
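The code that produced the triple output is not shown in this thread, so the following is only a guess at the cause: if the write call is indented one level too deep, it runs once per stop word instead of once per line. A minimal sketch of that assumption, using io.StringIO objects as stand-ins for the real files:

```python
import io

stop_list = ['the', 'a', 'an']
fin = io.StringIO('ABC\n')  # stand-in for the input file

# Hypothetical buggy variant: write() sits inside the inner loop,
# so it runs once for each of the three stop words.
buggy = io.StringIO()
for line in fin:
    for useless in stop_list:
        line = line.replace(useless + ' ', '')
        buggy.write(line)

# Intended variant: write() runs once per line, after the inner loop.
fin.seek(0)
fixed = io.StringIO()
for line in fin:
    for useless in stop_list:
        line = line.replace(useless + ' ', '')
    fixed.write(line)

print(repr(buggy.getvalue()))  # 'ABC\nABC\nABC\n'
print(repr(fixed.getvalue()))  # 'ABC\n'
```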

Use a regular expression:

import re 

with open('a.txt') as f, open('b.txt','w') as out: 
    stopList = ['the', 'a', 'an'] 
    pattern = '|'.join(r'\b{}\s+'.format(re.escape(word)) for word in stopList) 
    pattern = re.compile(pattern, flags=re.I) 
    out.writelines(pattern.sub('', line) for line in f) 

# import shutil 
# shutil.move('b.txt', 'a.txt')
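The commented-out shutil.move lines hint at replacing the original file once the filtered copy is complete. A sketch of that idea, wrapped in a function and going through a temporary file in the same directory, might look like this. The function name `strip_stop_words_in_place` and the use of tempfile plus os.replace are choices made for this illustration, not part of the answer above.

```python
import os
import re
import tempfile

STOP_WORDS = ['the', 'a', 'an']
# Same pattern as in the answer: a whole stop word followed by whitespace.
PATTERN = re.compile(
    '|'.join(r'\b{}\s+'.format(re.escape(word)) for word in STOP_WORDS),
    flags=re.I,
)


def strip_stop_words_in_place(path):
    """Rewrite the file at `path` without stop words via a temporary file."""
    directory = os.path.dirname(os.path.abspath(path))
    with open(path) as src, tempfile.NamedTemporaryFile(
            'w', dir=directory, delete=False) as tmp:
        tmp.writelines(PATTERN.sub('', line) for line in src)
    # Swap the filtered copy in only after both files are closed.
    os.replace(tmp.name, path)
```

Because `\s+` also matches a newline, a stop word at the very end of a line takes the line break with it and joins that line to the next one; whether that matters depends on the input.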


2013-12-08 15:01:22 falsetru
