Limit amount read using readline

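I'm trying to read the first 100 lines of large text files. Simple code for doing this is shown below. The challenge, though, is that I have to guard against corrupt or otherwise malformed files that contain no line breaks at all (yes, people somehow manage to generate these). In those cases I'd still like to read the data (because I need to see what's going on in there), but limit it to n bytes.

The only way I can think of to do this is to read the file char by char. Besides being slow (probably not an issue for only 100 lines), I'm worried that I'll run into trouble when I encounter a file that uses a non-ASCII encoding.

Is it possible to limit the bytes read using readline()? Or is there a better way to handle this?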

line_count = 0
with open(filepath, 'r') as f:
    for line in f:
        line_count += 1
        print('{0}: {1}'.format(line_count, line))
        if line_count == 100:
            break

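EDIT:

As @Fredrik correctly pointed out, readline() accepts an argument that limits the number of characters read (I had thought it was a buffer-size parameter). So, for my purposes, the following works quite well: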

max_bytes = 1024 * 1024  # overall cap on the amount read
bytes_read = 0
line_count = 0

with open(filepath, "r") as fo:
    # readline(size) returns at most `size` characters, even if no newline is found
    line = fo.readline(max_bytes)
    bytes_read += len(line)
    while line != '':
        line_count += 1
        print('{0}: {1}'.format(line_count, line))
        # stop after 100 lines or once the cap has been reached
        if line_count == 100 or bytes_read >= max_bytes:
            break
        line = fo.readline(max_bytes - bytes_read)
        bytes_read += len(line)
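
One possible variation on the loop above, aimed at the non-ASCII concern raised in the question: open the file in binary mode, so that the limit passed to readline() is counted in bytes, and decode each piece afterwards. This is only a sketch. The function name read_head, its parameters, and the UTF-8 default are assumptions made for illustration, not part of the original solution.

```python
def read_head(path, max_lines=100, max_bytes=1024 * 1024, encoding="utf-8"):
    """Return up to max_lines lines, reading at most max_bytes bytes in total."""
    lines = []
    remaining = max_bytes
    with open(path, "rb") as f:  # binary mode: readline(size) counts bytes
        while len(lines) < max_lines and remaining > 0:
            chunk = f.readline(remaining)
            if not chunk:  # end of file
                break
            remaining -= len(chunk)
            # errors="replace" avoids an exception if the byte limit
            # cut a multi-byte character in half
            lines.append(chunk.decode(encoding, errors="replace"))
    return lines
```

For a file with no line breaks, this returns a single "line" of at most max_bytes bytes, which is enough to inspect what the file contains.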


2016-02-05 Gadzooks34

If you have a file:

f = open("a.txt", "r") 
f.readline(size)

the size parameter gives the maximum number of bytes to read.
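
As a quick illustration of that parameter, here is a minimal sketch that writes a temporary file with no line breaks and reads it back with readline(size). The file contents and the limit of 10 are made up for the demo. Note that on Python 3 a file opened in text mode counts characters rather than bytes; open it with "rb" if the limit has to be in bytes.

```python
import os
import tempfile

# Hypothetical test data: 50 characters and no newline at all
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("x" * 50)
    path = tmp.name

with open(path, "r") as f:
    first = f.readline(10)   # stops after 10 characters, newline or not
    second = f.readline(10)  # continues where the previous call stopped
    rest = f.readline()      # no limit: reads to end of line or end of file

os.remove(path)
print(len(first), len(second), len(rest))
```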


2016-02-05 13:57:34 Fredrik

For more information on the methods available on a 'file' object, readers can check out the documentation [here](https://docs.python.org/2/library/stdtypes.html#file-objects). – Monkpit

The same goes for 'readlines()'. For the best approach, see 'http://stupidpythonideas.blogspot.fr/2013/06/readlines-considered-silly.html'. –

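That's it. I feel silly for asking this. Somehow I had it in my head that the size argument of readline() was just an initial buffer guess, not a limit on the number of characters read. I'll edit the question with the final solution in case it's useful to anyone. – Gadzooks34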

This checks for data with no line feeds:

dodgy = False
with open('abc.txt', 'r') as f:
    # no newline in the first KB suggests a file without line breaks
    if '\n' not in f.read(1024):
        print("Dodgy file - No linefeeds in the first Kb")
        dodgy = True
    f.seek(0)  # rewind so the real read starts at the beginning
    if not dodgy:  # read the first 100 lines
        for x in range(1, 101):
            try:
                line = next(f)
            except StopIteration:  # fewer than 100 lines in the file
                break
            print('{0}: {1}'.format(x, line))
    else:  # read the first n bytes
        line = f.read(1024)
        print('bytes: ' + line)
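
The try/next loop above can also be written with itertools.islice, which stops after a fixed number of lines or at end of file, whichever comes first. A small sketch follows, where first_lines is a hypothetical helper name. It does not cap the length of an individual line, so the no-linefeed check is still needed before calling it.

```python
from itertools import islice


def first_lines(path, n=100):
    """Return at most the first n lines of a text file."""
    with open(path, "r") as f:
        # islice stops after n items or at end of file, whichever comes first
        return list(islice(f, n))
```

The lines can then be numbered with enumerate(first_lines('abc.txt'), 1) in place of the range(1, 101) counter.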


2016-02-05 15:58:31
