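Problem statement: re.sub() gives incorrect output.
I want to insert an extra space after a period whenever the period is directly followed by a letter.
Here is the code:
import re
string = "This is very funny and cool.Indeed!"
re.sub(r"\.[a-zA-Z]", ". ", string)
And the output:
"This is very funny and cool. ndeed!"
The first character after the '.' gets replaced as well.
Any solutions?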
You can use a positive lookahead assertion, which does not consume the matched part:
>>> re.sub(r"\.(?=[a-zA-Z])", ". ", string)
'This is very funny and cool. Indeed!'
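To see why the original pattern ate the letter, it may help to compare what each pattern actually matches. This is only a small sketch; the variable names `text`, `consuming`, and `lookahead` are chosen here for illustration and are not from the question.

```python
import re

text = "This is very funny and cool.Indeed!"

# The original pattern matches the dot AND the letter, so both get replaced.
consuming = re.search(r"\.[a-zA-Z]", text)

# The lookahead only checks that a letter follows; the match is the dot alone.
lookahead = re.search(r"\.(?=[a-zA-Z])", text)

print(repr(consuming.group(0)))  # '.I'
print(repr(lookahead.group(0)))  # '.'
```

Since re.sub() replaces the whole match, only the second pattern leaves the "I" in place.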
An alternative using a capturing group and a backreference:
>>> re.sub(r"\.([a-zA-Z])", r". \1", string) # NOTE - r"raw string literal"
'This is very funny and cool. Indeed!'
FYI, you can use \S instead of [a-zA-Z] to match any non-whitespace character.
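Note that \S is broader than [a-zA-Z], so the two variants can differ on input such as decimal numbers. A rough sketch of the difference follows; the helper names and the sample sentence are made up for illustration.

```python
import re

def space_after_period_letters(text):
    # Add a space only when the period is followed by an ASCII letter.
    return re.sub(r"\.(?=[a-zA-Z])", ". ", text)

def space_after_period_nonspace(text):
    # Add a space when the period is followed by any non-whitespace character.
    return re.sub(r"\.(?=\S)", ". ", text)

sample = "cool.Indeed! Pi is 3.14"
print(space_after_period_letters(sample))   # cool. Indeed! Pi is 3.14
print(space_after_period_nonspace(sample))  # cool. Indeed! Pi is 3. 14
```

Which one fits depends on whether digits and punctuation after a period should also trigger the extra space.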
Learned something new. +1 – 2014-09-25 13:49:19
You can also use both a lookahead and a lookbehind in your regex.
>>> import re
>>> string="This is very funny and cool.Indeed!"
>>> re.sub(r'(?<=\.)(?=[A-Za-z])', r' ', string)
'This is very funny and cool. Indeed!'
OR
you can add \b:
>>> re.sub(r'(?<=\.)\b(?=[A-Za-z])', r' ', string)
'This is very funny and cool. Indeed!'
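Explanation:
(?<=\.) is a lookbehind that only checks that the position comes right after a literal dot.
(?=[A-Za-z]) is a lookahead asserting that the matched position must be followed by a letter.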
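Because both parts are assertions, the pattern matches an empty position between the dot and the letter rather than any characters. One way to check this, as a sketch (the names `pattern` and `spans` are just for this example), is to list the match spans with re.finditer():

```python
import re

string = "This is very funny and cool.Indeed!"
pattern = re.compile(r"(?<=\.)(?=[A-Za-z])")

# Each match is zero-width: start and end are the same index.
spans = [m.span() for m in pattern.finditer(string)]
print(spans)

for start, end in spans:
    # The character before the position is the dot, the one after is a letter.
    print(repr(string[start - 1]), repr(string[end]))
```

re.sub() then inserts the replacement at that empty position, which is why nothing from the original string is lost.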
Try using a capture group – jaap3 2014-09-25 13:41:08