You could go for:
(?|                        # a so-called "branch reset", only supported by the regex module
    c\                     # a "c "
    (?P<catcher>.{2,}?)    # at least two characters, lazily -> group "catcher"
    \ b\                   # followed by " b "
    |                      # or
    c\ &\ b\               # "c & b "
    (?P<catcher>.+)        # capture the rest of the line -> group "catcher"
)
In Python code:
# the newer regex module
import regex as re
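rx = re.compile(r'''
    (?|
        c\                     # "c " (the escaped space must not be stripped)
        (?P<catcher>.{2,}?)    # the fielder's name, lazily
        \ b\                   # " b "
        |
        c\ &\ b\               # "c & b "
        (?P<catcher>.+)        # the bowler's name, up to the end of the line
    )
''', re.VERBOSE)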
sampletext = """
c Soumya Sarkar b Rubel Hossain
c Imrul Kayes b Mosaddek Hossain
c & b Sodhi
c Anderson b Boult
"""
catchers = [m.group('catcher') for m in rx.finditer(sampletext)]
print(catchers)
# ['Soumya Sarkar', 'Imrul Kayes', 'Sodhi', 'Anderson']
See it working on regex101.com.
You need the newer regex module (pip install regex) for this to work.
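If installing regex is not an option, something similar can probably be done with the standard-library re module. It has no branch reset and rejects a group name that is defined twice, so the sketch below uses two differently named groups and picks whichever one matched. The names rx_std, caught_and_bowled, fielder and catchers_std are made up for illustration and are not part of the answer above.

```python
# Sketch using only the standard-library re module (no branch reset).
import re

# The group names "caught_and_bowled" and "fielder" are illustrative only.
rx_std = re.compile(r'''
    c[ ]
    (?:
        &[ ]b[ ](?P<caught_and_bowled>.+)   # "c & b <name>": the bowler took the catch
        |
        (?P<fielder>.{2,}?)[ ]b[ ]          # "c <name> b ": lazily capture the fielder
    )
''', re.VERBOSE)

sampletext = """
c Soumya Sarkar b Rubel Hossain
c Imrul Kayes b Mosaddek Hossain
c & b Sodhi
c Anderson b Boult
"""

# Only one of the two groups is set per match, so fall back from one to the other.
catchers_std = [m.group('caught_and_bowled') or m.group('fielder')
                for m in rx_std.finditer(sampletext)]
print(catchers_std)
# ['Soumya Sarkar', 'Imrul Kayes', 'Sodhi', 'Anderson']
```

Writing the spaces as [ ] instead of an escaped space keeps them intact in verbose mode even if an editor strips trailing whitespace.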
Source: Jan, 2017-01-08 13:44:26
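But it does return the bowler's name after "b". It also doesn't pick up anything in the c & b case. Could you also add some explanation? – Neel
I was trying it on regex101. I will try it in my code and then update. – Neel
I corrected the regex in my first edit (around the time you posted your comment): you can retry it on regex101, it should work. – Faibbus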