Python regex: splitting a mention of two years that appear together

My strings contain incorrectly formatted mentions of the form "(19561958)", which I would like to split into "(1956-1958)". The regex I tried is:

import re 
a = "(19561958)" 
re.sub(r"(\d\d\d\d\d\d\d\d)", r"\1-", a)

But this returns "(19561958-)". How can I get the result I want? Many thanks!

Source

2015-02-08 Crista23

You can capture the two years separately, then put a hyphen between the two groups:

>>> import re 
>>> re.sub(r'(\d{4})(\d{4})', r'\1-\2', '(19561958)') 
'(1956-1958)'

Note that \d\d\d\d is more concisely written as \d{4}.

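As written, this inserts a hyphen between the first two groups of four digits in any run of eight or more digits. If the parentheses must be present, you can require them explicitly with lookarounds: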

>>> re.sub(r''' 
    (?<=\() # make sure there's an opening parenthesis prior to the groups 
    (\d{4}) # one group of four digits 
    (\d{4}) # and a second group of four digits 
    (?=\)) # with a closing parenthesis after the two groups 
''', r'\1-\2', '(19561958)', flags=re.VERBOSE) 
'(1956-1958)'

Alternatively, you can use word boundaries, which also handle cases such as spaces around the eight digits:

>>> re.sub(r'\b(\d{4})(\d{4})\b', r'\1-\2', '(19561958)') 
'(1956-1958)'
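To make the difference between the three patterns above concrete, here is a minimal sketch that runs each of them over a few inputs. The sample strings and the `compare` helper are made up for illustration and are not part of the original answer.

```python
import re

# The three patterns discussed above.
unanchored = re.compile(r"(\d{4})(\d{4})")
parenthesised = re.compile(r"(?<=\()(\d{4})(\d{4})(?=\))")
bounded = re.compile(r"\b(\d{4})(\d{4})\b")


def compare(text):
    """Hypothetical helper: apply each pattern and return the three results."""
    return (
        unanchored.sub(r"\1-\2", text),
        parenthesised.sub(r"\1-\2", text),
        bounded.sub(r"\1-\2", text),
    )


# Hypothetical sample inputs, chosen only to show where the patterns differ.
for sample in ["(19561958)", "born 19561958 in", "id1956195812"]:
    print(sample, "->", compare(sample))
```

Only the unanchored pattern touches "id1956195812", because it accepts the first eight digits of any longer run. The lookaround version leaves "born 19561958 in" alone since there are no parentheses, while the word-boundary version still rewrites it.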

Source

2015-02-08 16:07:17 jonrsharpe

You can use capturing groups or lookarounds.

re.sub(r"\((\d{4})(\d{4})\)", r"(\1-\2)", a)

\d{4} matches exactly 4 digits.

Example:

>>> a = "(19561958)" 
>>> re.sub(r"\((\d{4})(\d{4})\)", r"(\1-\2)", a) 
'(1956-1958)'

Using lookarounds:

>>> a = "(19561958)" 
>>> re.sub(r"(?<=\(\d{4})(?=\d{4}\))", r"-", a) 
'(1956-1958)'

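(?<=\(\d{4}) is a positive lookbehind: it asserts that the match must be preceded by ( and four digit characters.
(?=\d{4}\)) is a positive lookahead: it asserts that the match must be followed by four digits and a ) symbol.
What gets matched here is the zero-width boundary between the two years. Replacing that boundary with - gives you the desired output.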
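The zero-width nature of that match can be checked directly. The sketch below reuses the same pattern with `re.search` to show that the match is an empty string sitting between the two years; the variable names are chosen for illustration only.

```python
import re

a = "(19561958)"
boundary = re.compile(r"(?<=\(\d{4})(?=\d{4}\))")

m = boundary.search(a)
start, end = m.span()

# The match consumes no characters: start and end are the same index.
print(repr(m.group()), start, end)

# Splicing a hyphen in at that index is what re.sub does here.
manual = a[:start] + "-" + a[end:]
print(manual)
```

Because nothing is consumed, the replacement string is just "-" and no backreferences are needed.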

Source

2015-02-08 16:07:25

Use two capturing groups: r"(\d\d\d\d)(\d\d\d\d)" or r"(\d{4})(\d{4})".

The second group is then referenced as \2.
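Put together, a minimal sketch of this suggestion might look as follows. The `\g<1>` form is an equivalent spelling from Python's `re` replacement syntax that this answer does not mention; it is shown only as an alternative.

```python
import re

a = "(19561958)"
pattern = r"(\d{4})(\d{4})"

# \1 is the first year, \2 the second; the hyphen goes between them.
numbered = re.sub(pattern, r"\1-\2", a)

# Equivalent spelling, useful when a digit directly follows the group reference.
explicit = re.sub(pattern, r"\g<1>-\g<2>", a)

print(numbered, explicit)
```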

Source

2015-02-08 16:09:19 aneroid
