Converting a Perl regular expression to a Python regular expression

I'm having trouble converting a Perl regular expression to Python. The text I want to match has the following pattern:

 
Author(s) : Firstname Lastname 
       Firstname Lastname 
       Firstname Lastname 
       Firstname Lastname

In Perl, I was able to match this with

/Author\(s\) :((.+\n)+?)/

to extract the authors. When I try

re.compile(r'Author\(s\) :((.+\n)+?)')

in Python, it matches the first author twice and ignores the rest.

Can anyone explain what I'm doing wrong here?

Source

2011-01-23 Eric Seidel

How are you actually running the match after you compile the pattern? – 2011-01-23 22:50:16

You can do it like this:

# find lines with authors 
import re 

# multiline string to simulate possible input 
text = ''' 
Stuff before 
This won't be matched... 
Author(s) : Firstname Lastname 
       Firstname Lastname 
       Firstname Lastname 
       Firstname Lastname 
Other(s)  : Something else we won't match 
       More shenanigans.... 
Only the author names will be matched. 
''' 

# run the regex to pull author lines from the sample input 
authors = re.search(r'Author\(s\)\s*:\s*(.*?)^[^\s]', text, re.DOTALL | re.MULTILINE).group(1)

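The regex above matches the leading text (Author(s), whitespace, a colon, whitespace) and then lazily captures everything up to the first line that starts with a non-whitespace character. Because the author lines after the first all begin with whitespace, the capture holds exactly those lines: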

'''Firstname Lastname 
      Firstname Lastname 
      Firstname Lastname 
      Firstname Lastname 
'''

You can then use the following regex to pull the individual authors out of that result:

# grab authors from the lines 
import re 
authors = '''Firstname Lastname 
      Firstname Lastname 
      Firstname Lastname 
      Firstname Lastname 
''' 

# run the regex to pull a list of individual authors from the author lines 
authors = re.findall(r'^\s*(.+?)\s*$', authors, re.MULTILINE)

which gives you the list of authors:

['Firstname Lastname', 'Firstname Lastname', 'Firstname Lastname', 'Firstname Lastname']

Combined example code:

text = ''' 
Stuff before 
This won't be matched... 
Author(s) : Firstname Lastname 
       Firstname Lastname 
       Firstname Lastname 
       Firstname Lastname 
Other(s)  : Something else we won't match 
       More shenanigans.... 
Only the author names will be matched. 
''' 

import re 
stage1 = re.compile(r'Author\(s\)\s*:\s*(.*?)^[^\s]', re.DOTALL | re.MULTILINE) 
stage2 = re.compile(r'^\s*(.+?)\s*$', re.MULTILINE) 

preliminary = stage1.search(text).group(1) 
authors = stage2.findall(preliminary)

This sets authors to:

['Firstname Lastname', 'Firstname Lastname', 'Firstname Lastname', 'Firstname Lastname']

Success!
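One caveat with the two-stage approach above: the stage-1 pattern ends with ^[^\s], so it needs a non-indented line after the author block, and search() returns None (making .group(1) raise AttributeError) when the block is the last thing in the text. The following is only a sketch of one way to guard against that, assuming the same input layout as the sample. The function name extract_authors and the lookahead (?=^\S|\Z) are my own illustrative choices, not part of the original answer.

```python
import re

# Lazily capture everything after "Author(s) :" up to (but not including)
# the next line that starts with a non-whitespace character, or the end
# of the string if no such line exists.
AUTHOR_BLOCK = re.compile(
    r'Author\(s\)\s*:\s*(.*?)(?=^\S|\Z)',
    re.DOTALL | re.MULTILINE,
)


def extract_authors(text):
    """Return the author names as a list, or [] if no author block is found."""
    match = AUTHOR_BLOCK.search(text)
    if match is None:
        return []
    # One author per line; strip indentation and skip blank lines.
    return [line.strip() for line in match.group(1).splitlines() if line.strip()]
```

Calling extract_authors(text) on the sample input from the combined example should give the same four-element list, and it returns an empty list instead of raising when there is no Author(s) block at all.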

Source

2011-01-23 23:44:11 lunixbochs

Thanks! This works perfectly! – 2011-01-24 00:38:45

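A capturing group holds only one value. Even if the group is repeated, you can only access the text from its last iteration. You have to match all of the names at once and then split them (on newlines, or with a second regex).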
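A small sketch of that behaviour, using made-up placeholder names (Alice Smith and so on are not from the question) and a greedy version of the pattern so the whole block is consumed. The repeated inner group keeps only its final iteration, while the outer group still holds the full block, which can then be split on newlines.

```python
import re

# Hypothetical sample input in the same layout as the question.
text = (
    "Author(s) : Alice Smith\n"
    "       Bob Jones\n"
    "       Carol White\n"
)

m = re.search(r'Author\(s\) :((.+\n)+)', text)

whole_block = m.group(1)  # everything the repeated group consumed
last_line = m.group(2)    # the inner group is overwritten on each repeat

# Split the outer capture to recover each individual name.
names = [line.strip() for line in whole_block.splitlines()]

print(repr(last_line))
print(names)
```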

Source

2011-01-23 22:52:16 poke

Thanks, I based part of my answer on this. – lunixbochs 2011-01-24 00:41:16

Try

re.compile(r'Author\(s\) :((.+\n)+)')

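In the original expression, +? makes the repetition non-greedy, meaning it repeats as few times as possible. Since nothing follows the group in the pattern, a single line is enough for the match to succeed, so it stops after the first author.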
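To make the difference concrete, here is a short comparison of the lazy and greedy quantifiers on a made-up input (the names are placeholders, not from the question). It also shows why the first author appears twice with +?: both the outer and the inner group capture the same single line.

```python
import re

# Hypothetical sample input in the same layout as the question.
text = (
    "Author(s) : Alice Smith\n"
    "       Bob Jones\n"
    "       Carol White\n"
)

lazy = re.search(r'Author\(s\) :((.+\n)+?)', text)
greedy = re.search(r'Author\(s\) :((.+\n)+)', text)

# Lazy: one repetition is enough, so both groups hold the first line only.
print(lazy.groups())

# Greedy: the outer group spans all three author lines.
print(repr(greedy.group(1)))

lazy_line_count = lazy.group(1).count("\n")
greedy_line_count = greedy.group(1).count("\n")
```

Note that even with the greedy form, the inner group still ends up holding only the last line, so the outer group has to be split afterwards to get the individual names.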

Source

2011-01-23 22:55:10
