2017-10-15
0

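Python 3.x: print the lines after a specific header

I have a problem I can't seem to solve; apologies if this is a duplicate, but I never found a real answer. I am extracting specific information from a configuration file that presents its data as blocks of text, and I only need to print specific blocks, without their headers. So, for example (with the text format below), I would only want to capture the information under header2, and nothing from header3 onward:

# The output could contain multiple headers and lines, or no lines under a header; this is an example of what could be present, but it is not absolute.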

header1 
------- 
line1 
line2 
line3 # can be muiplies availables or known 

header2 
------- 
line1 
line2 
line3 # can be muiplies availables or known 

header3 
------- 

header4 
------- 
line1 
line2 
line3 # can be multiple linnes or none not known 

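Here is what I started with, but I am stuck on the second loop, boolean, or other logic needed to print only the lines of one header's block:

Raw_file = "scrap.txt"
scrape = open(Raw_file, "r")

for fooline in scrape:
    if "Header" in fooline:
        # print(fooline)  # prints all lines
        # print lines under header 2 and stop before header 3
        pass  # placeholder: this is the part I cannot work out

scrape.close()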

Answers

2

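Use a boolean, switched on and off by detecting header lines, to control printing:

RAW_FILE = "scrap.txt"

DESIRED = 'header2'

with open(RAW_FILE) as scrape:

    printing = False

    for line in scrape:

        if line.startswith(DESIRED):
            printing = True       # the wanted header: start printing
        elif line.startswith('header'):
            printing = False      # any other header: stop printing
        elif line.startswith('-------'):
            continue              # skip the dashed underline
        elif printing:
            print(line, end='')   # the line already ends with a newline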

OUTPUT

> python3 test.py 
line1 
line2 
line3 # can be muiplies availables or known 

> 

Adjust as needed.
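The same toggle can be wrapped in a generator so that the block's lines can be reused rather than only printed. This is a minimal sketch, not part of the answer itself: the name `lines_under` and its parameters are hypothetical, and it assumes every header line starts with `header` and is underlined with dashes, as in the sample. It compares the whole stripped line to the header name, so `header2` does not also match `header20`.

```python
import io


def lines_under(lines, wanted, header_prefix="header"):
    """Yield the non-blank lines that follow the `wanted` header.

    Stops at the next header. `lines_under` is an illustrative name only.
    """
    printing = False
    for line in lines:
        stripped = line.strip()
        if stripped == wanted:
            printing = True          # entered the block we want
        elif stripped.startswith(header_prefix):
            if printing:
                break                # next header reached: stop reading
        elif set(stripped) == {"-"}:
            continue                 # skip the dashed underline
        elif printing and stripped:
            yield line.rstrip()      # body line without trailing whitespace


sample = io.StringIO(
    "header1\n-------\nline1\n\n"
    "header2\n-------\nline1\nline2\nline3\n\n"
    "header3\n-------\n\n"
    "header4\n-------\nline1\n"
)
print(list(lines_under(sample, "header2")))
```

Any iterable of lines works as the first argument, so an open file object can be passed in place of the `io.StringIO` sample.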

+0

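This is excellent, thanks. If I also wanted to print a single item from that line, how would I go about it? I tried splitting it and printing line[0] to get '3' (line sample = "3 man enable none"), but no luck: it keeps returning a None object. Maybe I am not understanding something. – onxx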
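On the follow-up about getting the first field: `str.split()` with no arguments splits on runs of whitespace and returns a list, so the first element for the sample line is `'3'`. One common way to end up with `None` is to keep the return value of `print()`, which is always `None`. Whether that is what happened here is a guess, since the comment does not show the code. A small sketch, with `first_field` as a hypothetical helper name:

```python
def first_field(line):
    """Return the first whitespace-separated field of `line`, or None if blank."""
    fields = line.split()   # splits on any run of whitespace, newline included
    return fields[0] if fields else None


sample_line = "3 man enable none\n"
print(first_field(sample_line))

result = print("ignored")   # print() writes its argument and returns None
print(result is None)
```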

0

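You can set a flag that starts and stops collection, based on matching the contents of header2 and header3.

For example, with example.txt containing the full data provided in the question:

f = "example.txt"
scrape = open(f, "r")

collect = 0  # 0 = before header2, 1 = collecting, 2 = header3 reached
wanted = []

for fooline in scrape:
    if "header2" in fooline:
        collect = 1
    if "header3" in fooline:
        collect = 2

    if collect == 1:
        wanted.append(fooline)
    elif collect == 2:
        break

scrape.close()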

Contents of wanted:

['header2\n', 
'-------\n', 
'line1\n', 
'line2\n', 
'line3 # can be muiplies availables or known\n', 
'\n'] 
0

Initially, set flag to False. Check whether the line starts with header2; if it does, set flag to True. If the line starts with header3, set flag to False.

If flag is set, print the line.

Raw_file = "scrap.txt"
scrape = open(Raw_file, "r")
flag = False

for fooline in scrape:
    if fooline.find("header3") == 0:
        flag = False  # or break
    if flag:
        print(fooline)
    if fooline.find("header2") == 0:
        flag = True
scrape.close()

Output:

------- 

line1 

line2 

line3 # can be muiplies availables or known 
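The blank lines in this output appear because `print()` adds its own newline after a line that already ends in `\n`, and the dashed underline is printed because the flag is switched on right after the header line. One possible way to avoid both is sketched below, assuming the same file layout; `block_after` is a hypothetical name, not part of the answer.

```python
import io


def block_after(lines, start="header2", stop="header3"):
    """Collect the non-blank lines after `start`, ending before `stop`."""
    flag = False
    collected = []
    for fooline in lines:
        text = fooline.rstrip()            # drop the trailing newline and spaces
        if flag and text.startswith(stop):
            break                          # end of the wanted block
        if flag and text and set(text) != {"-"}:
            collected.append(text)         # skip blanks and the dashed underline
        if text.startswith(start):
            flag = True
    return collected


sample = io.StringIO("header2\n-------\nline1\nline2\n\nheader3\n-------\n")
for text in block_after(sample):
    print(text)
```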
1

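You might consider using a regular expression to break this into blocks.

If the file is of a manageable size, just read it all at once and use a regex such as: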

(^header\d+[\s\S]+?(?=^header|\Z)) 

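to break it into blocks.

The Python code then looks like this (it gets all the text from one header up to the next):

import re

fn = "scrap.txt"  # path to the input file

with open(fn) as f:
    txt = f.read()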

for m in re.finditer(r'(^header\d+[\s\S]+?(?=^header|\Z))', txt, re.M): 
    print(m.group(1)) 

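If the file is larger than you want to read in one gulp, you can use mmap with a regex, which lets the file be read in fairly large chunks.

If you are looking for only one header, it is even easier: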
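A minimal sketch of the mmap idea, under some assumptions: the file is UTF-8, it is not empty (an empty file cannot be mapped), and `find_block` is a hypothetical helper name. A regex applied to an `mmap` object has to be a bytes pattern, so the pattern is built from bytes and the match is decoded afterwards.

```python
import mmap
import os
import re
import tempfile


def find_block(path, header):
    """Return the text of the block starting at `header`, or None if absent."""
    # Bytes pattern: required when searching an mmap object.
    # \b keeps "header2" from also matching "header20".
    pattern = re.compile(
        rb'^' + re.escape(header.encode()) + rb'\b[\s\S]+?(?=^header|\Z)',
        re.M,
    )
    with open(path, "rb") as f:
        # Length 0 maps the whole file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            m = pattern.search(mm)
            return m.group(0).decode("utf-8") if m else None


# Demo on a throwaway file so the sketch runs on its own.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"header1\n-------\nline1\n\nheader2\n-------\nline1\nline2\n\nheader3\n-------\n")
    demo_path = tmp.name

demo_block = find_block(demo_path, "header2")
os.remove(demo_path)
print(demo_block)
```

Only the matched block is copied into a Python string; the rest of the file stays in the mapping and is paged in by the operating system as the search scans it.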

m=re.search(r'(^header2[\s\S]+?(?=^header|\Z))', txt, re.M) 
if m: 
    print(m.group(1)) 
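Since the question asks for the block without its header, the pattern can be adjusted so that the capture group starts after the header line and its dashed underline. This is a sketch under the assumption that every header is followed by a line of dashes, as in the sample; the inline `txt` stands in for the file contents.

```python
import re

txt = (
    "header1\n-------\nline1\n\n"
    "header2\n-------\nline1\nline2\nline3\n\n"
    "header3\n-------\n"
)

# Group 1 holds only the text after the header line and its dashed underline.
pattern = re.compile(
    r'^header2[ \t]*\n-+[ \t]*\n([\s\S]*?)(?=^header|\Z)',
    re.M,
)

m = pattern.search(txt)
body = m.group(1).strip() if m else ""
print(body)
```

Using `*?` instead of `+?` inside the group lets the pattern match a header that has no lines under it, in which case `body` is an empty string.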
