Closing and reopening the read file is required to write search string results to the output file

I have the following code:

import fileinput, os, glob, re 

# Find text file to search in. Open. 
filename = str(glob.glob('*.txt'))[2:][:-2] 
print("found " + filename + ", opening...") 
f = open(filename, 'r') 

# Create output csv write total found occurrences of search string after name of search string 
with open(filename[:-4] + 'output.csv','w') as output:  
    output.write("------------Group 1----------\n") 
    output.write(("String 1,") + str((len(re.findall(r's5 .*w249 w1025 w301 w1026 .*',f.read())))) +"\n") 
    output.write(("String 1 reverse,") + str((len(re.findall(r's5 .*w1026 w301 w1025 w249 .*',f.read())))) +"\n") 

# close and finish 
f.close 
output.close

It successfully finds the first string and writes the total to the output file, but it writes zero matches for 'String 1 reverse', even though it should find 1000.

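It works if I insert this between the search for String 1 and the search for String 1 reverse:

f.close 
f = open(filename, 'r')

That is, I close the read file and then open it again.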

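I don't want to add this after every search line. What is happening? Is it something to do with caching of the open file, or caching in the regular expression?

Thanks

Source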

2017-08-14 JamesR

It should be f.close(), not f.close.

I don't have a sample to test your example with, but I suspect the problem comes from:

output.write(("String 1,") + str((len(re.findall(r's5 .*w249 w1025 w301 w1026 .*',f.read())))) +"\n") 
output.write(("String 1 reverse,") + str((len(re.findall(r's5 .*w1026 w301 w1025 w249 .*',f.read())))) +"\n")

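You are calling f.read() twice. The first call reads the whole file and leaves the cursor at the end of the file. The second f.read() then returns an empty string, because there is no more data to read.

Keep in mind that reading a file moves the read cursor (a position attached to the file descriptor): after reading n bytes, the cursor has advanced by n bytes. Called without arguments, f.read() reads the entire file and leaves the cursor at the end of the file.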
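To see this behaviour in isolation, here is a minimal sketch. It assumes an in-memory io.StringIO as a stand-in for a real file, and the sample text is made up for illustration. File objects returned by open() advance their cursor in the same way.

```python
import io

# In-memory stand-in for a text file; the contents are invented for this demo.
f = io.StringIO("s5 w249 w1025\ns5 w1026 w301\n")

first = f.read()    # reads everything and moves the cursor to the end
second = f.read()   # nothing left to read, so this is an empty string

print(repr(first))
print(repr(second))
```

Any search run on the second result sees an empty string, which is why the second count in the question comes out as zero.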

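There are two ways to solve this:

1. Store the file contents in a variable (e.g. content = f.read()) and run the searches on that variable.
2. Use the file seek function:

To change the file object's position, use f.seek(offset, from_what). The position is computed from adding offset to a reference point; the reference point is selected by the from_what argument. A from_what value of 0 measures from the beginning of the file, 1 uses the current file position, and 2 uses the end of the file as the reference point. from_what can be omitted and defaults to 0, using the beginning of the file as the reference point.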

https://docs.python.org/3/tutorial/inputoutput.html
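As a rough sketch of the seek approach, the example below rewinds the file with f.seek(0) between the two reads. The temporary file and its two sample lines are assumptions made up so the example is self-contained; only the two regular expressions come from the question.

```python
import os
import re
import tempfile

# Write a small throwaway file; the two sample lines are invented for this demo.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("s5 a w249 w1025 w301 w1026 b\ns5 a w1026 w301 w1025 w249 b\n")
    path = tmp.name

with open(path, "r") as f:
    forward = len(re.findall(r"s5 .*w249 w1025 w301 w1026 .*", f.read()))
    f.seek(0)  # rewind to the beginning before reading again
    reverse = len(re.findall(r"s5 .*w1026 w301 w1025 w249 .*", f.read()))

os.remove(path)  # clean up the throwaway file
print(forward, reverse)
```

Without the f.seek(0) call, the second f.read() would return an empty string and the reverse count would be 0.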

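The first solution is the recommended one: you don't need to read the file more than once, and seek is mainly useful when manipulating large files.

Here is a fixed version of your code that follows this recommendation: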

import fileinput, os, glob, re 

# Find text file to search in. Open. 
filename = str(glob.glob('*.txt'))[2:][:-2] 
print("found " + filename + ", opening...") 
content = open(filename, 'r').read() 

# Create output csv write total found occurrences of search string after name of search string 
with open(filename[:-4] + 'output.csv','w') as output:  
    output.write("------------Group 1----------\n") 
    output.write(("String 1,") + str((len(re.findall(r's5 .*w249 w1025 w301 w1026 .*',content)))) +"\n") 
    output.write(("String 1 reverse,") + str((len(re.findall(r's5 .*w1026 w301 w1025 w249 .*',content)))) +"\n")

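Note: you no longer need to call close() yourself, because you don't keep a reference to the file object. In CPython the file is closed when that unreferenced object is reclaimed; a with block makes the close explicit if you prefer not to rely on that.

Source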

2017-08-14 05:21:39 Fabien

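Once you call file.read(), the entire file has been read and the pointer is at the end of the file; that is why the second line returns no results.

You need to read the contents first and then run your analysis on them: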

print("found " + filename + ", opening...") 
f = open(filename, 'r') 
contents = f.read() 
f.close() # -- note f.close() not f.close 

results_a = re.findall(r's5 .*w249 w1025 w301 w1026 .*',contents) 
results_b = re.findall(r's5 .*w1026 w301 w1025 w249 .*',contents) 

with open(filename[:-4] + 'output.csv','w') as output:  
    output.write("------------Group 1----------\n") 
    output.write("String 1, {}\n".format(len(results_a))) 
    output.write("String 1 reverse, {}\n".format(len(results_b)))

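You don't need output.close (which did nothing in the first place, since without parentheses the method is never called), because the with statement closes the file automatically.

If you want to repeat this for all files that match your pattern: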
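The difference between f.close and f.close(), and the automatic close done by with, can be checked through the closed attribute. This is a small sketch that assumes io.StringIO objects in place of real files; the variable names are illustrative only.

```python
import io

f = io.StringIO("demo")
f.close                      # only refers to the method; nothing is called
still_open = not f.closed    # the object is still open at this point
f.close()                    # the parentheses make the call happen
closed_after_call = f.closed

# The with statement closes the object when the block ends.
with io.StringIO("demo") as g:
    pass
closed_by_with = g.closed

print(still_open, closed_after_call, closed_by_with)
```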

import glob 
import re 
import os 

BASE_DIR = '/full/path/to/file/directory' 

for file in glob.iglob(os.path.join(BASE_DIR, '*.txt')): 
    # Read the whole file once; the with block closes it afterwards 
    with open(file) as f: 
        contents = f.read() 
    # Base name without directory or extension, e.g. 'data' for '/path/data.txt' 
    filename = os.path.splitext(os.path.basename(file))[0] 
    results_a = re.findall(r's5 .*w249 w1025 w301 w1026 .*', contents) 
    results_b = re.findall(r's5 .*w1026 w301 w1025 w249 .*', contents) 
    with open(os.path.join(BASE_DIR, '{}output.csv'.format(filename)), 'w') as output: 
        output.write("------------Group 1----------\n") 
        output.write("String 1, {}\n".format(len(results_a))) 
        output.write("String 1 reverse, {}\n".format(len(results_b)))

Source

2017-08-14 05:22:20
