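Python 3: handling stripping lines in binary mode
With help from SO members I got as far as the code below. Its purpose is to merge the text files from a given folder and its subfolders and store the output as master.txt. Occasionally, though, I get a traceback; it looks like an error is thrown while a file is being read.
Based on the suggestions, the input I received, and some research, it seems like a good idea either to clean the text files into one uniform Unicode encoding, or to use some line-by-line function so that reading each line trims junk characters and drops blank lines.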
import shutil
import os.path
root = 'C:\\Dropbox\\test\\'
files = [(path,f) for path,_,file_list in os.walk(root) for f in file_list]
with open('C:\\Dropbox\\Python\\master.txt','wb') as output:
    for path, f_name in files:
        with open(os.path.join(path, f_name), 'rb') as input:
            shutil.copyfileobj(input, output)
            output.write(b'\n') # insert extra newline
with open('master.txt', 'r') as f:
    lines = f.readlines()
with open('master.txt', 'w') as f:
    f.write("".join(L for L in lines if L.strip()))
The traceback I get:
Traceback (most recent call last):
  File "C:\Dropbox\Python\master1.py", line 14, in <module>
    lines = f.readlines()
  File "C:\PYTHON32\LIB\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 8159: character maps to <undefined>
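The traceback passes through encodings\cp1252.py, which suggests the text-mode open() decoded master.txt as cp1252, a codec that has no character assigned to byte 0x81. A minimal sketch that reproduces the same error outside the script follows; the sample bytes and the helper name try_decode are made up for illustration and do not come from the real files.

```python
# Hypothetical sample: ASCII text with one 0x81 byte, which cp1252 leaves undefined.
raw = b"caf\x81 au lait\n"

def try_decode(data, encoding="cp1252"):
    """Return the decoded text, or a short failure message if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "failed: " + exc.reason

# Strict decoding fails the same way f.readlines() does in the traceback.
print(try_decode(raw))

# errors="replace" swaps each undecodable byte for U+FFFD instead of raising.
print(raw.decode("cp1252", errors="replace"))

# latin-1 maps every byte to some character, so it never raises,
# although the resulting character may not be the intended one.
print(repr(raw.decode("latin-1")))
```

Whether replacing or reinterpreting bytes is acceptable depends on what encoding the source files really use, which the traceback alone does not reveal.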
So... what is the question again? –
@ignacio-vazquez-abrams What would make the traceback go away? – user1582596
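One possible way to avoid the decode step entirely is to drop the blank lines while still in binary mode, so no codec is ever involved. The sketch below is only an illustration under stated assumptions: the function name merge_text_files and its parameters are hypothetical, and it assumes the input files use an ASCII-compatible encoding (splitting on b"\n" would not be correct for UTF-16, for example).

```python
import os

def merge_text_files(root, output_path):
    """Concatenate every file under root into output_path, skipping blank lines.

    Everything is handled as bytes, so undecodable bytes such as 0x81
    are copied through unchanged instead of raising UnicodeDecodeError.
    """
    output_abs = os.path.abspath(output_path)
    with open(output_path, "wb") as output:
        for path, _, file_list in os.walk(root):
            for f_name in sorted(file_list):
                source = os.path.join(path, f_name)
                if os.path.abspath(source) == output_abs:
                    continue  # never merge the output file into itself
                with open(source, "rb") as src:
                    for line in src:  # binary files iterate line by line on b"\n"
                        if line.strip():  # bytes.strip() removes ASCII whitespace
                            # Normalize the line ending so each file ends cleanly.
                            output.write(line.rstrip(b"\r\n") + b"\n")
```

With the paths from the question, the call would look like merge_text_files('C:\\Dropbox\\test\\', 'C:\\Dropbox\\Python\\master.txt'). This replaces both the copy pass and the second read/rewrite pass, at the cost of leaving any junk bytes inside non-blank lines untouched.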