Incorrect output using re.sub()

I want to insert an extra space after a period if the period is directly followed by a letter.

Here is the code:

string="This is very funny and cool.Indeed!" 

re.sub("\.[a-zA-Z]", ". ", string)

And the output:

"This is very funny and cool. ndeed!"

The first character after the '.' gets replaced as well.

Any solutions?

Source

2014-09-25 Noffil Chougle

Try using a capturing group – jaap3 2014-09-25 13:41:08

You can use a positive lookahead assertion, which does not consume the matched part:

>>> re.sub(r"\.(?=[a-zA-Z])", ". ", string) 
'This is very funny and cool. Indeed!'
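
To see why the lookahead helps, it may be useful to compare what each pattern actually matches. The following is only a small sketch; the variable names `text`, `consumed`, and `checked` are mine and not from the question:

```python
import re

text = "This is very funny and cool.Indeed!"

# The pattern from the question consumes the letter,
# so the letter is part of the text that re.sub replaces.
consumed = re.search(r"\.[a-zA-Z]", text).group()

# With a lookahead the letter is only checked, not consumed;
# the match itself is just the period.
checked = re.search(r"\.(?=[a-zA-Z])", text).group()

print(repr(consumed))  # '.I'
print(repr(checked))   # '.'
```

Since re.sub replaces the whole match, the first pattern throws away the "I" while the second one leaves it in place.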

Alternatively, use a capturing group and a backreference:

>>> re.sub(r"\.([a-zA-Z])", r". \1", string) # NOTE - r"raw string literal" 
'This is very funny and cool. Indeed!'

FYI, you can use \S instead of [a-zA-Z] to match any non-whitespace character.

Source
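
A rough sketch of the \S variant, wrapped in a hypothetical helper (`space_after_period` is a name I made up for illustration). Keep in mind that \S also matches digits and punctuation, so it is broader than the letter class:

```python
import re

def space_after_period(text):
    # Hypothetical helper: add a space after every period that is
    # directly followed by a non-whitespace character.
    return re.sub(r"\.(?=\S)", ". ", text)

fixed = space_after_period("This is very funny and cool.Indeed!")
decimal = space_after_period("Pi is 3.14")

print(fixed)    # This is very funny and cool. Indeed!
print(decimal)  # Pi is 3. 14
```

The second call shows the trade-off: a decimal number gets split too, which may or may not be what you want.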

2014-09-25 13:43:58 falsetru

Learned something new. +1 – 2014-09-25 13:49:19

You can also use both a lookahead and a lookbehind in your regex.

>>> import re 
>>> string="This is very funny and cool.Indeed!" 
>>> re.sub(r'(?<=\.)(?=[A-Za-z])', r' ', string) 
'This is very funny and cool. Indeed!'

You can also add \b,

>>> re.sub(r'(?<=\.)\b(?=[A-Za-z])', r' ', string) 
'This is very funny and cool. Indeed!'

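Explanation:

(?<=\.) looks just after a literal dot.
(?=[A-Za-z]) asserts that the matched boundary must be followed by a letter.
If so, the boundary is replaced with a space.

Source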
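
As a quick check of the explanation, here is a minimal sketch that inspects the match object. The names `pattern`, `m`, and `rebuilt` are only for illustration:

```python
import re

text = "This is very funny and cool.Indeed!"
pattern = re.compile(r"(?<=\.)(?=[A-Za-z])")

m = pattern.search(text)

# Both assertions are zero-width, so the match is an empty string:
# a position between the dot and the letter, not a character.
print(repr(m.group()))       # ''
print(m.start() == m.end())  # True

# Replacing that position is the same as inserting a space there.
rebuilt = text[:m.start()] + " " + text[m.start():]
print(rebuilt)               # This is very funny and cool. Indeed!
```

Nothing is removed from the string because the match has no length; re.sub simply inserts the replacement at that position.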

2014-09-25 13:52:58
