2011-06-22 55 views
1

Python: how to grab a certain number of lines after a match

Say I have an input text file in the following format:

Section1 Heading Number of lines: n1 
Line 1 
Line 2 
... 
Line n1 
Maybe some irrelevant lines 

Section2 Heading Number of lines: n2 
Line 1 
Line 2 
... 
Line n2 

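Certain sections of the file begin with a header line that specifies how many lines are in that section. Each section header has a different name.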

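I've written a regex that matches the header line based on the section header name the user searches for, parses it, and returns the number n1/n2/etc. that tells me how many lines are in the section. I've been trying to use a for-in loop to read each line until a counter reaches n1, but it hasn't worked so far.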

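Here's my question: how do I return a certain number of lines following a matched line when that number is given in the match and differs for each section? I'm new to programming, and I appreciate any help.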

Edit: Okay, here's the relevant code I have so far:

import re 
print 
fname = raw_input("Enter filename: ") 
toolname = raw_input("Enter toolname: ") 

def findcounter(fname, toolname): 
     logfile = open(fname, "r") 

     pat = 'SUCCESS Number of lines :' 
     #headers all have that format 
     for line in logfile: 
       if toolname in line: 
        if pat in line: 
          s=line 

     pattern = re.compile(r"""(?P<name>.*?)  #starting name 
          \s*SUCCESS  #whitespace and success 
          \s*Number\s*of\s*lines #whitespace and strings 
          \s*\:\s*(?P<n1>.*)""",re.VERBOSE) 
     match = pattern.match(s) 
     name = match.group("name") 
     n1 = int(match.group("n1")) 
     #after matching line, I attempt to loop through the next n1 lines 
     lcount = 0 
     for line in logfile: 
      if line == match: 
        while lcount <= n1: 
           match.append(line) 
           lcount += 1 
           return result 

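The file itself is quite long, and there are lots of irrelevant lines interspersed between the sections I'm interested in. What I'm not sure about is how to specify printing the lines directly after a matched line.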


Can you show the relevant code you have so far? Please read [this](/) page and then update your question.


I know this is f***ing old, but you saved my life. int(match.group("n1")) ... <3

Answers

1
# f is a file object 
# n1 is how many lines to read 
lines = [f.readline() for i in range(n1)] 

But how do I specify printing only the lines directly after the matched header line?


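@Simos: You read lines from the file one by one. As soon as the current line matches your filter, run the code I provided. Note that each `readline` call on the file advances the file's internal read pointer to the next line.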
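The approach described in this comment could be sketched roughly as follows (written for Python 3). This is only an illustration, not the answerer's code: the function name `lines_after_header` and the constant `HEADER` are hypothetical, and the header layout (`<name> SUCCESS Number of lines : <n>`) is assumed from the regex in the question. The file is read exclusively through `readline`, so the position after the header is exactly the first line of the section.

```python
import io
import re

# Assumed header layout, taken from the regex in the question:
#   "<name> SUCCESS Number of lines : <n>"
HEADER = re.compile(
    r"(?P<name>.*?)\s*SUCCESS\s*Number\s*of\s*lines\s*:\s*(?P<n1>\d+)"
)

def lines_after_header(f, toolname):
    """Return the n1 lines that follow the first header naming `toolname`."""
    # iter(f.readline, "") keeps calling readline until it returns "" (EOF).
    for line in iter(f.readline, ""):
        match = HEADER.match(line)
        if match and toolname in match.group("name"):
            n1 = int(match.group("n1"))
            # The header has been consumed, so the next readline call
            # returns the first line of the section.
            return [f.readline().rstrip("\n") for _ in range(n1)]
    return []  # no matching header found

# Stand-in for a real log file, so the sketch runs on its own.
sample = io.StringIO(
    "noise\n"
    "toolA SUCCESS Number of lines : 2\n"
    "a1\n"
    "a2\n"
    "irrelevant\n"
    "toolB SUCCESS Number of lines : 3\n"
    "b1\n"
    "b2\n"
    "b3\n"
)
print(lines_after_header(sample, "toolB"))  # ['b1', 'b2', 'b3']
```

If the file ends before n1 lines have been read, `readline` returns an empty string, so the result is padded with empty strings rather than raising an error.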

0

You can put the logic in a generator like this:

def take(seq, n):
    """Get the next n items from an iterator."""
    return [next(seq) for i in range(n)]

def getblocks(lines):
    # `it` is an iterator, so it remembers where we are in the list of lines.
    it = iter(lines)
    for line in it:
        try:
            # Try to parse the line as a header: "<section> <heading> <count>"
            sec, heading, num = line.split()
            num = int(num)
        except ValueError:
            # Not a header, so try the next line.
            continue

        # We got a header, so take the next `num` lines.
        yield take(it, num)

#test 
data = """ 
Section1 Heading 3 
Line 1 
Line 2 
Line 3 

Maybe some irrelevant lines 

Section2 Heading 2 
Line 1 
Line 2 
""".splitlines() 

print(list(getblocks(data)))
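
One caveat with the generator above: if the last section is shorter than its header claims, `next` raises `StopIteration` inside the generator, which Python 3.7 and later turn into a `RuntimeError`. A possible variant, assuming truncated sections should simply come back shorter, uses `itertools.islice` instead. The names `take_upto` and `getblocks_safe` are made up for this sketch, and the three-token header format is the one from the test data above.

```python
from itertools import islice

def take_upto(it, n):
    """Return up to n items from iterator `it` (fewer if it runs out)."""
    return list(islice(it, n))

def getblocks_safe(lines):
    it = iter(lines)
    for line in it:
        parts = line.split()
        # Assumed header shape: exactly three tokens, the last one an integer.
        if len(parts) == 3 and parts[2].isdigit():
            yield take_upto(it, int(parts[2]))

data = [
    "Section1 Heading 2",
    "Line 1",
    "Line 2",
    "noise",
    "Section2 Heading 3",  # claims 3 lines, but only 1 follows
    "Line 1",
]
print(list(getblocks_safe(data)))
# [['Line 1', 'Line 2'], ['Line 1']]
```

Whether a short final block should be returned silently or reported as an error depends on the log format; this sketch picks the lenient option.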