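Regex: match exactly three lines

I want to match the following input. How can I match a group a certain number of times without using a multi-line string? Something like (^(\d+) (.+)$){3} (but that does not work).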

sample_string = """Breakpoint 12 reached 
     90 good morning 
    91 this is cool 
    92 this is bananas 
    """ 
pattern_for_continue = re.compile("""Breakpoint \s (\d+) \s reached \s (.+)$ 
           ^(\d+)\s+ (.+)\n 
           ^(\d+)\s+ (.+)\n 
           ^(\d+)\s+ (.+)\n 
            """, re.M|re.VERBOSE) 
matchobj = pattern_for_continue.match(sample_string) 
print(matchobj.group(0))


2013-03-18 Rose Perrone

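Change '$' to '\n'. – hughdbrown 2013-03-18 17:28:02

Your use of VERBOSE makes *all* whitespace in the pattern non-matching, so the spaces around the digits on your first line are ignored too. – 2013-03-18 17:30:52

Also, in a multi-line (verbose) regular expression, whitespace is not part of the pattern - it is ignored, much like a comment. You need to insert '\s+' and '\s*' explicitly. – hughdbrown 2013-03-18 17:31:45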

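There are a number of problems with your expression and your sample:

You use VERBOSE, which makes all whitespace in the pattern non-matching, so the spaces around the digits on your first line are ignored too. Replace the spaces with \s or [ ] (the latter matches only a literal space; the former also matches newlines and tabs).
Your sample input has whitespace before the digits on each line, but your pattern requires the digits to be at the start of the line. Either allow that whitespace or fix your sample input.
The biggest problem is that a capturing group inside a repeated group (so (\d+) inside a larger group with {3} at the end) captures only the last match. You would get 92 and this is bananas, not the two lines matched before them.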
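The last point can be checked directly. The following is a minimal sketch, not part of the original answer: the sample text is left-aligned for brevity, and the names `text`, `repeated` and `explicit` are illustrative.

```python
import re

# Three numbered lines, each terminated by a newline.
text = "90 good morning\n91 this is cool\n92 this is bananas\n"

# One group repeated three times: the inner capturing groups are overwritten
# on every repetition, so only the values from the last line survive.
repeated = re.compile(r"(?:(\d+)[ ]+([^\n]+)\n){3}")
print(repeated.match(text).groups())

# The same line pattern written out three times: each line keeps its own groups.
line = r"(\d+)[ ]+([^\n]+)\n"
explicit = re.compile(line * 3)
print(explicit.match(text).groups())
```

The first call prints only the pair from the third line, while the second prints all six captured values.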

To overcome all of this, you have to repeat the pattern for the three lines explicitly. You can let Python do the repetition for you:

import re

linepattern = r'[ ]* (\d+) [ ]+ ([^\n]+)\n'

pattern_for_continue = re.compile(r""" 
    Breakpoint [ ]+ (\d+) [ ]+ reached [ ]+ ([^\n]*?)\n 
    {} 
""".format(linepattern * 3), re.MULTILINE|re.VERBOSE)

which, for your sample input, returns:

>>> pattern_for_continue.match(sample_string).groups() 
('12', '', '90', 'hey this is a great line', '91', 'this is cool too', '92', 'this is bananas')

If you really do not want to match whitespace before the digits on the 3 extra lines, you can remove the first [ ]* pattern from linepattern.
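As a rough check, the same approach can be run against the indented sample from the question. This is a sketch under assumptions: `build_pattern` is a hypothetical helper introduced here, and the sample is written with explicit `\n` escapes so that the trailing space after `reached`, which the header pattern requires, stays visible.

```python
import re

# The question's sample. The header line ends in a space before the newline,
# which the "[ ]+" after "reached" in the header pattern needs.
sample_string = (
    "Breakpoint 12 reached \n"
    "     90 good morning\n"
    "    91 this is cool\n"
    "    92 this is bananas\n"
)

def build_pattern(n_lines):
    """Hypothetical helper: the header line followed by n_lines numbered lines."""
    linepattern = r"[ ]* (\d+) [ ]+ ([^\n]+)\n"
    header = r"Breakpoint [ ]+ (\d+) [ ]+ reached [ ]+ ([^\n]*?)\n"
    return re.compile(header + linepattern * n_lines, re.MULTILINE | re.VERBOSE)

match = build_pattern(3).match(sample_string)
print(match.groups())

# Asking for a fourth numbered line fails, because the sample has only three.
print(build_pattern(4).match(sample_string))

# Every line, including the last one, must end with a newline.
print(build_pattern(3).match(sample_string.rstrip("\n")))
```

Building the pattern from a count keeps the per-line groups separate while avoiding three hand-copied lines. Note that `match` only anchors at the start of the string, so input with additional numbered lines after the third would still match.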


2013-03-18 17:55:33

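Thanks! I have updated my question. – 2013-03-19 00:35:07

@Rose: 'input2' is missing a newline at the end. Every line *must* end with a newline for the pattern to match. – 2013-03-19 00:39:58

Perfect. Thanks! – 2013-03-19 01:30:07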

Code

You need something more like this:

import re 

sample_string = """Breakpoint 12 reached 
90 hey this is a great line 
91 this is cool too 
92 this is bananas 
""" 
pattern_for_continue = re.compile(r""" 
    Breakpoint\s+(\d+)\s+reached\s+\n 
    (\d+)[ ]+([^\n]+?)\n   # [ ]+ is needed: a bare space is ignored under re.VERBOSE
    (\d+)[ ]+([^\n]+?)\n
    (\d+)[ ]+([^\n]+?)\n
""", re.MULTILINE|re.VERBOSE) 
matchobj = pattern_for_continue.match(sample_string) 

for i in range(1, 8):
    print(i, matchobj.group(i))
print("Entire match:")
print(matchobj.group(0))

Result

1 12 
2 90 
3 hey this is a great line 
4 91 
5 this is cool too 
6 92 
7 this is bananas 
Entire match: 
Breakpoint 12 reached
90 hey this is a great line 
91 this is cool too 
92 this is bananas

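Reason

re.VERBOSE makes explicit whitespace necessary in your regex. I partially worked around this by left-justifying your data in the multi-line string. I think this is justified because you probably do not have this in real code; it is likely an artifact of editing in a multi-line string.
You need to replace $ with \n.
You need non-greedy matches.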
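The whitespace point can be illustrated with a minimal sketch. It is not from the original answer, and the names `plain`, `verbose` and `explicit` are illustrative only.

```python
import re

# Without re.VERBOSE, the space in the pattern is a literal space.
plain = re.compile(r"(\d+) ([a-z]+)")
print(plain.match("90 good").groups())

# With re.VERBOSE, that space is ignored: the pattern acts like (\d+)([a-z]+),
# so it no longer matches input that contains the space.
verbose = re.compile(r"(\d+) ([a-z]+)", re.VERBOSE)
print(verbose.match("90 good"))          # no match, prints None
print(verbose.match("90good").groups())

# Writing the space inside a character class keeps it significant.
explicit = re.compile(r"(\d+) [ ]+ ([a-z]+)", re.VERBOSE)
print(explicit.match("90 good").groups())
```

A character class such as `[ ]` or an escape such as `\s` survives re.VERBOSE, which is why both answers in this thread spell out their whitespace that way.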


2013-03-18 17:43:28 hughdbrown

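This may match your 'sample_string', but it does not match the OP's 'sample_string'. – 2013-03-18 17:51:02

Yes, I modified the OP's sample string. Since this is a piece of cut-and-pasted code with an indented """...""", most likely her real data does not look like that either. – hughdbrown 2013-03-18 17:53:17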
