Python multi-line regex replace

I'm embarrassed to be asking yet another regex question, but this one has been driving me crazy for the past week.

I'm trying to use a regular expression in Python to replace some text that looks like this:

text = """some stuff 
line with text 
other stuff 
[code language='cpp'] 
#include <cstdio> 

int main() { 
    printf("Hello"); 
} 
[/code] 
Maybe some 
other text"""

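What I want to do is capture the text inside the [code] tags, add a tab (\t) in front of each line, and then replace the whole [code]...[/code] block with these tab-prefixed lines. In other words, I want the result to look like this: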

"""some stuff 
line with text 
other stuff 

    #include <cstdio> 

    int main() { 
        printf("Hello"); 
    } 

Maybe some 
other text"""

I am using the following snippet.

import re

class CodeParser(object):
    """Parse a blog post and turn it into markdown."""

    def __init__(self):
        self.regex = re.compile(r'.*\[code.*?\](?P<code>.*)\[/code\].*',
                                re.DOTALL)

    def parse_code(self, text):
        """Parses code section from a wp post into markdown."""
        code = self.regex.match(text).group('code')
        code = ['\t%s' % s for s in code.split('\n')]
        code = '\n'.join(code)
        return self.regex.sub('\n%s\n' % code, text)

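The problem is that, because of the leading and trailing .*, the pattern also matches every character before and after the code tags, so when I do the replacement all of that surrounding text is removed. If I drop the .*, the pattern no longer matches anything.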

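I thought this might be a newline problem, so I tried replacing every '\n' with, say, '¬', doing the match, and then changing the '¬' back to '\n', but I had no luck with that approach.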
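As a side note on the newline theory: the placeholder trick should not be needed, because re.DOTALL already lets '.' match '\n'. The sketch below shows the difference on a made-up string (`sample` and the variable names are illustrative, not from the original code).

```python
import re

sample = "before\n[code]\nline 1\nline 2\n[/code]\nafter"

# Without re.DOTALL, '.' stops at a newline, so the body cannot span lines.
plain = re.compile(r'\[code.*?\](?P<code>.*)\[/code\]')
# With re.DOTALL, '.' also matches '\n'.
dotall = re.compile(r'\[code.*?\](?P<code>.*)\[/code\]', re.DOTALL)

plain_match = plain.search(sample)
dotall_match = dotall.search(sample)

print(plain_match)                        # None
print(repr(dotall_match.group('code')))   # '\nline 1\nline 2\n'
```

Since the pattern in the question is already compiled with re.DOTALL, newlines are probably not what breaks the match.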

If anyone has a better way to accomplish what I'm trying to do, I'm open to suggestions.

Thanks.


2015-07-11 Andrés

You're on the right track. Instead of regex.match, use regex.search. That way you can get rid of the leading and trailing .*s.
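To illustrate the difference, here is a small sketch on a shortened, made-up input (`text`, `pattern` and the placeholder replacement are illustrative only). match() only tries the pattern at position 0, while search() scans the whole string, and sub() replaces only the span that actually matched.

```python
import re

text = "some stuff\n[code language='cpp']\nint x;\n[/code]\nother text"
pattern = re.compile(r'\[code.*?\](?P<code>.*)\[/code\]', re.DOTALL)

# match() anchors at the start of the string, where the text is
# "some stuff", so it finds nothing without a leading .*
matched = pattern.match(text)

# search() looks for the first position where the pattern fits.
found = pattern.search(text)

# sub() replaces only the matched span, so the surrounding text survives.
result = pattern.sub('<code here>', text)

print(matched)                      # None
print(repr(found.group('code')))    # '\nint x;\n'
print(repr(result))                 # 'some stuff\n<code here>\nother text'
```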

Try this:

    def __init__(self):
        self.regex = re.compile(r'\[code.*?\](?P<code>.*)\[/code\]',
                                re.DOTALL)

    def parse_code(self, text):
        """Parses code section from a wp post into markdown."""
        # Here we are using search, which finds the pattern anywhere in the
        # string rather than just at the beginning.
        code = self.regex.search(text).group('code')
        code = ['\t%s' % s for s in code.split('\n')]
        code = '\n'.join(code)

        return self.regex.sub('\n%s\n' % code, text)
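One possible refinement, offered as a sketch rather than as part of the answer above: if a post contains several [code] blocks, the greedy .* spans from the first opening tag to the last closing tag, and a string replacement is scanned for backslash escapes, which can misfire when the code itself contains backslashes. Passing a function to sub() and using a non-greedy group avoids both. The names `CODE_BLOCK`, `indent_block` and the standalone `parse_code` are hypothetical, and stripping the newline next to each tag is an assumption about the desired output.

```python
import re

# Non-greedy body so each [code]...[/code] pair is matched separately.
# The optional \n on either side drops the newline next to each tag.
CODE_BLOCK = re.compile(r'\[code[^\]]*\]\n?(?P<code>.*?)\n?\[/code\]',
                        re.DOTALL)


def indent_block(match):
    """Return the captured code with a tab in front of every line."""
    lines = match.group('code').split('\n')
    return '\n' + '\n'.join('\t' + line for line in lines) + '\n'


def parse_code(text):
    # sub() calls indent_block once per match and inserts its return value
    # literally, so backslashes in the code are not treated as escapes.
    return CODE_BLOCK.sub(indent_block, text)


post = 'a\n[code language=\'cpp\']\nprintf("Hi\\n");\n[/code]\nb\n[code]\nx = 1\n[/code]\nc'
print(parse_code(post))
```

Each block is indented independently here, whereas the single search() call above only ever reads the first match.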


2015-07-11 20:36:50 gymbrall

Thanks! I should have kept reading the documentation a little further down... it's [right there](https://docs.python.org/3/howto/regex.html#match-versus-search).
