Python - How do I open a file and specify an offset (in bytes)?

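I am writing a program that will periodically parse an Apache log file to record its visitors, bandwidth usage, and so on.

The problem is that I don't want to open the log and parse data I have already parsed. For example: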

line1 
line2 
line3

If I parse this file, I will save all the lines and then save the offset. That way, when I parse it again, I get:

line1 
line2 
line3 - The log will open from this point 
line4 
line5

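The second time round, I would get line4 and line5. Hopefully that makes sense...

What I need to know is: how do I accomplish this? Python has the seek() function to specify an offset... so do I just get the file size of the log (in bytes) after parsing it, then use that as the offset (in seek()) the second time I parse it?

I can't seem to think of a way to code this >.<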

2010-07-21 dave

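You can manage the position in a file with the seek and tell methods of the file class; see https://docs.python.org/2/tutorial/inputoutput.html

The tell method tells you where to seek to the next time you open the file.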
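To make the seek/tell idea concrete, here is a minimal sketch. The helper name read_new_lines and the file contents are made up for illustration, and the file is opened in binary mode on the assumption that you want the offset to be a plain byte count:

```python
import os
import tempfile

def read_new_lines(path, offset=0):
    """Return (lines, new_offset) for everything after byte `offset`."""
    with open(path, 'rb') as log:      # binary mode: offsets are plain bytes
        log.seek(offset)               # jump to where the last run stopped
        lines = log.readlines()        # read only the part not seen yet
        return lines, log.tell()       # tell() gives the offset to save

# Demo on a throwaway file (hypothetical contents).
path = os.path.join(tempfile.mkdtemp(), 'access.log')
with open(path, 'wb') as f:
    f.write(b'line1\nline2\nline3\n')
first, offset = read_new_lines(path)

with open(path, 'ab') as f:            # the log grows between runs
    f.write(b'line4\nline5\n')
second, offset = read_new_lines(path, offset)
print(second)                          # only line4 and line5
```

In a real program the offset would be stored somewhere persistent between runs instead of being kept in a variable.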

2010-07-21 12:38:33 luc

This looks like it will do what I want. Cheers. – dave 2010-07-21 13:04:39

Hmm, it seems the link needs updating. There is no reference to file objects; perhaps: https://docs.python.org/2/tutorial/inputoutput.html – cevaris 2016-07-02 14:42:17

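If your log file fits in memory (that is, you have a reasonable rotation policy), you can easily do something like this:

with open('logfile', 'r') as f:
    log_lines = f.readlines()
# get_last_lineprocessed, parse_log and store_last_lineprocessed are placeholders you supply
last_line = get_last_lineprocessed()  # from some persistent storage
parse_log(log_lines[last_line:])  # parse only the lines not seen before
store_last_lineprocessed(len(log_lines))  # next run starts after the last line read

If you can't do that, you can use something like Get last n lines of a file with Python, similar to tail (see the accepted answer for the use of seek and tell, in case you need them).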

2010-07-21 12:38:31

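The logs are for virtual hosts, so there is currently no log rotation. I guess I should look into setting it up... that would make your solution very useful. Cheers. – dave 2010-07-21 13:05:30

If you parse the log line by line, you can save the line number where the last parse stopped. Next time, you just start reading from that line.

Seeking is more useful when you have to be at a specific location in the file.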

2010-07-21 12:40:28

Easy, but not recommended :):

last_line_processed = get_last_line_processed()
with open('file.log') as log:
    for record_number, record in enumerate(log):
        # skip every line that was handled on a previous run
        if record_number >= last_line_processed:
            parse_log(record)

2010-07-21 12:41:47 systempuntoout

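log = open('myfile.log')
pos = open('pos.dat', 'w')
print(log.readline())
pos.write(str(log.tell()))  # save the offset reached so far
log.close()
pos.close()

log = open('myfile.log')
pos = open('pos.dat')
log.seek(int(pos.readline()))  # resume from the saved offset
print(log.readline())

Of course, you shouldn't use it like this. You should wrap the operations in functions such as save_position(myfile) and load_position(myfile), but the functionality is all there.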
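One possible shape for those two wrappers is sketched below. The extra pos_path parameter, the file names, and the choice of binary mode are assumptions made for the example, not part of the answer above:

```python
import os
import tempfile

def save_position(myfile, pos_path):
    """Write the current offset of the open file `myfile` to `pos_path`."""
    with open(pos_path, 'w') as pos:
        pos.write(str(myfile.tell()))

def load_position(myfile, pos_path):
    """Seek `myfile` to the offset stored in `pos_path`, if there is one."""
    if os.path.exists(pos_path):
        with open(pos_path) as pos:
            myfile.seek(int(pos.read().strip() or 0))

# Demo with throwaway files (hypothetical names).
tmp = tempfile.mkdtemp()
log_path = os.path.join(tmp, 'myfile.log')
pos_path = os.path.join(tmp, 'pos.dat')
with open(log_path, 'wb') as f:
    f.write(b'line1\nline2\n')

with open(log_path, 'rb') as log:
    load_position(log, pos_path)   # no saved position yet: stays at 0
    first = log.readline()
    save_position(log, pos_path)

with open(log_path, 'rb') as log:
    load_position(log, pos_path)   # resumes after the first line
    second = log.readline()
```

Using with blocks means the position file is closed even if parsing raises an exception.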

2010-07-21 12:44:54

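Note that in Python you can also seek() from the end of the file:

f.seek(-3, os.SEEK_END)

This puts the read position 3 lines from EOF.

But why not use diff, either from the shell or with difflib?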

2010-07-21 12:45:23 user106514

That will actually put the read position 3 characters from EOF, not 3 lines. – Duncan 2010-07-21 12:59:21
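A small sketch that shows the difference, using a throwaway file with made-up contents. It opens the file in binary mode because Python 3 rejects non-zero end-relative seeks on text-mode files:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'log.txt')
with open(path, 'wb') as f:
    f.write(b'line1\nline2\nline3\n')

with open(path, 'rb') as f:        # binary mode, so end-relative seeks are allowed
    f.seek(-3, os.SEEK_END)        # 3 bytes before EOF, not 3 lines
    tail = f.read()
print(tail)                        # the last three bytes of the file
```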

Here is code that uses your length suggestion and the tell method to demonstrate the difference:

beginning="""line1 
line2 
line3""" 

end="""- The log will open from this point 
line4 
line5""" 

openfile = open('log.txt', 'w')
openfile.write(beginning)
endstarts = openfile.tell()  # real offset where the appended text will start
openfile.close()

with open('log.txt', 'a') as appendfile:
    appendfile.write(end)
with open('log.txt') as wholefile:
    print(wholefile.read())

print("\nAgain:")
end2 = open('log.txt', 'r')
end2.seek(len(beginning))

print(end2.read())  # starts two bytes early on Windows, where each '\n' is stored as '\r\n'
end2.seek(endstarts)

print("\nOk in Windows also")
print(end2.read())
end2.close()

2010-07-21 12:59:16

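Here is an efficient, safe snippet that reads a log while keeping the offset saved in a parallel file. Basically logtail in Python.

import os

# OFFSET_ROOT_DIR, filename and new_logrows_handler are supplied by the caller
offset_filename = os.path.join(OFFSET_ROOT_DIR, filename)
with open(filename) as log_fd:
    if not os.path.exists(offset_filename):
        offset_dir = os.path.dirname(offset_filename)
        if not os.path.isdir(offset_dir):
            os.makedirs(offset_dir)
        with open(offset_filename, 'w') as offset_fd:
            offset_fd.write(str(0))
    with open(offset_filename, 'r+') as offset_fd:
        log_fd.seek(int(offset_fd.readline() or 0))  # an empty offset file means start at 0
        new_logrows_handler(log_fd.readlines())
        offset_fd.seek(0)
        offset_fd.write(str(log_fd.tell()))
        offset_fd.truncate()  # drop leftover digits from a longer old value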
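Since log rotation came up earlier in the thread, a saved offset can end up pointing past the end of a freshly rotated file. The sketch below is one way to guard against that. The helper name safe_offset is hypothetical, and it assumes that rotation leaves a file shorter than the saved offset:

```python
import os
import tempfile

def safe_offset(log_path, saved_offset):
    """Return saved_offset, or 0 if the log is now shorter than that offset."""
    if saved_offset > os.path.getsize(log_path):
        return 0          # file shrank: assume it was rotated or truncated
    return saved_offset

# Demo on a throwaway file (hypothetical contents, 12 bytes long).
path = os.path.join(tempfile.mkdtemp(), 'access.log')
with open(path, 'wb') as f:
    f.write(b'line1\nline2\n')

still_valid = safe_offset(path, 6)      # offset lies inside the file
after_rotation = safe_offset(path, 500) # offset lies beyond the file
```

This check cannot detect a rotated file that has already grown past the old offset, so treat it as a partial safeguard only.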

2012-02-24 10:53:04
