Reading multiple file chunks between start and stop flags

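I am trying to read sections of a file into numpy arrays. The different sections of the file have similar start and stop flags. I have found an approach that works, but it only reads one section of the input file before the file has to be reopened.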

My code at the moment is:

with open("myFile.txt") as f:
    array = []
    parsing = False
    for line in f:
        if line.startswith('stop flag'):
            parsing = False
        if parsing:
            pass  # do things to the data
        if line.startswith('start flag'):
            parsing = True

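I found this code in this question.

With this code, I need to reopen the file and read through it again for each section.

Is there a way to read all of the sections without having to reopen the file for each one?

Source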

2015-07-19 user27630

How big is your file, and how comfortable are you with generators? – NightShadeQueen

You can use itertools.takewhile to take lines up to the stop flag each time you reach a start flag:

from itertools import takewhile

with open("myFile.txt") as f:
    array = []
    for line in f:
        if line.startswith('start flag'):
            # lazily yields the lines of this section, up to the stop flag
            data = takewhile(lambda x: not x.startswith("stop flag"), f)
            # use (consume) data here, then the outer loop repeats
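
As a minimal sketch of how the lazy takewhile iterator might be consumed, the example below uses an in-memory io.StringIO in place of myFile.txt. The sample contents and the read_sections helper name are assumptions made for illustration, not part of the original answer. Each section is materialised before the outer loop advances, because takewhile reads from the same file iterator as the outer loop.

```python
import io
from itertools import takewhile


def read_sections(f):
    """Collect the lines between each 'start flag' and 'stop flag' line."""
    sections = []
    for line in f:
        if line.startswith("start flag"):
            # takewhile pulls lines from the same iterator as the outer loop.
            data = takewhile(lambda x: not x.startswith("stop flag"), f)
            # Consuming it here also reads (and discards) the stop flag line.
            sections.append([x.strip() for x in data])
    return sections


# Hypothetical file contents, used only for this illustration.
sample = io.StringIO(
    "header\n"
    "start flag\n1 2\n3 4\nstop flag\n"
    "ignored\n"
    "start flag\n5 6\nstop flag\n"
)
sections = read_sections(sample)
print(sections)
```

Note that takewhile swallows the stop flag line itself, so this sketch only fits when that line does not need to be kept.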

Or just use an inner loop:

with open("myFile.txt") as f:
    array = []
    for line in f:
        if line.startswith('start flag'):
            # beginning of a section; use the first line here if needed
            for line in f:
                # check for the end of the section, breaking on the stop line
                if line.startswith("stop flag"):
                    break
                # else process lines from the section

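A file object returns itself as its iterator, so the file pointer keeps moving as you iterate over f. When you reach a start flag, you start processing a section and carry on until you hit the stop flag, and the outer loop then resumes from the line after it. There is no reason to reopen the file: just iterate over the lines of the file once and use the sections as you go. If the start and stop flag lines are considered part of the section, make sure you use those lines as well.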
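
The single-pass idea above could be wrapped in a generator, roughly as follows. This is only a sketch: the iter_sections name, the whitespace-separated numeric rows, and the sample data are assumptions, and io.StringIO stands in for the real file.

```python
import io


def iter_sections(f, start="start flag", stop="stop flag"):
    """Yield one list of parsed rows per start/stop section, in one pass."""
    for line in f:
        if line.startswith(start):
            rows = []
            # The inner loop continues from the line after the start flag,
            # because both loops share the same file iterator.
            for line in f:
                if line.startswith(stop):
                    break
                # Assumption: each data line holds whitespace-separated numbers.
                rows.append([float(v) for v in line.split()])
            yield rows


# Hypothetical file contents, used only for this illustration.
sample = io.StringIO(
    "start flag A\n1 2\n3 4\nstop flag\n"
    "noise\n"
    "start flag B\n5 6\nstop flag\n"
)
blocks = list(iter_sections(sample))
print(blocks)
```

Each yielded list of rows could then be passed to numpy.array if numpy arrays are what you need in the end.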

Source

2015-07-19 23:39:27

The nested loop solution works perfectly. Thank you. – user27630

You have an indentation problem; your code should look like this:

with open("myFile.txt") as f:
    array = []
    parsing = False
    for line in f:
        if line.startswith('stop flag'):
            parsing = False
        if parsing:
            pass  # do things to the data
        if line.startswith('start flag'):
            parsing = True

Source

2015-07-19 23:39:11 Chaker

Let's say this is the file you are reading:

**starting**
blabla
blabla
**starting**
bleble
bleble
**starting**
bumbum
bumbum

This is the program's code:

# the with block closes the file automatically
with open("testfile.txt", "r") as file:
    data = file.read()
data = data.split("**starting**")
print(data)

This is the output:

['', '\nblabla\nblabla\n', '\nbleble\nbleble\n', '\nbumbum\nbumbum']

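Afterwards you can del the empty elements, or do whatever else you need with your data. split is a built-in method of string objects, and it can take more complex strings as its argument.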
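
As a rough sketch of that clean-up step, the snippet below drops the empty chunks and splits each remaining chunk into lines. The text variable is a stand-in for the contents read from testfile.txt, and the sections name is an assumption for illustration.

```python
# Stand-in for the contents read from testfile.txt in the example above.
text = (
    "**starting**\nblabla\nblabla\n"
    "**starting**\nbleble\nbleble\n"
    "**starting**\nbumbum\nbumbum"
)

chunks = text.split("**starting**")

# Skip chunks that are empty after stripping whitespace, and break
# the rest into individual lines.
sections = [chunk.strip().splitlines() for chunk in chunks if chunk.strip()]
print(sections)
```

Keep in mind that this approach reads the whole file into memory and splits on a single delimiter, so it does not by itself handle separate start and stop flags.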

Source

2015-07-19 23:43:08 Laszlowaty

A solution similar to yours would be:

result = []
parse = False
with open("myFile.txt") as f:
    for line in f:
        if line.startswith('stop flag'):
            parse = False
        elif line.startswith('start flag'):
            parse = True
        elif parse:
            result.append(line)
        else:  # not needed, but I like to always add an else clause
            continue
print(result)

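But you could also use an inner loop or itertools.takewhile, as the other answers suggest. Using takewhile in particular should be much faster for really large files.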

Source

2015-07-19 23:47:56
