Python - re.sub without replacing part of the regex

So, for example, I have the string "perfect bear hunts" and I want to replace the word before "bear" with the word "the".

So the resulting string would be "the bear hunts".

I thought I would use

re.sub("\w+ bear","the","perfect bear hunts")

but it replaces "bear" too. How do I exclude "bear" from being replaced while still using it for the match?

Source

2017-10-05 Gillian

@Rawing Good point, edited it – Gillian

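Like the other answers, I would use a positive lookahead assertion.

Then, to address the issue Rawing raised in several comments (what about words like "beard"?), I would add (\b|$). This matches a word boundary or the end of the string, so you only match the word "bear" and nothing longer.

So you end up with the following: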

import re 

def bear_replace(string): 
    return re.sub(r"\w+ (?=bear(\b|$))", "the ", string)

And test cases (using pytest):

import pytest 

@pytest.mark.parametrize('string, expected', [ 
    ("perfect bear swims", "the bear swims"), 

    # We only replace the first word before 'bear'
    ("before perfect bear swims", "before the bear swims"), 

    # 'beard' isn't captured 
    ("a perfect beard", "a perfect beard"), 

    # We handle the case where 'bear' is the end of the string 
    ("perfect bear", "the bear"), 

    # 'bear' is followed by a non-space punctuation character 
    ("perfect bear-string", "the bear-string"), 
]) 
def test_bear_replace(string, expected): 
    assert bear_replace(string) == expected
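As a side note, `\b` already matches at the end of the string when the last character is a word character, so the `|$` alternative should not be needed for these cases. A minimal sketch, assuming a hypothetical variant named `bear_replace_simple` (not part of the original answer):

```python
import re


def bear_replace_simple(string):
    # Hypothetical variant of bear_replace: \b alone, without the "|$" alternative.
    # The lookahead checks for the whole word "bear" without consuming it.
    return re.sub(r"\w+ (?=bear\b)", "the ", string)


# Same inputs as the parametrized test cases above.
cases = [
    "perfect bear swims",
    "before perfect bear swims",
    "a perfect beard",
    "perfect bear",
    "perfect bear-string",
]
for case in cases:
    print(repr(case), "->", repr(bear_replace_simple(case)))
```

For the five inputs used in the tests, this variant should give the same results as `bear_replace`.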

Source

2017-10-05 15:33:51 alexwlchan

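Sorry to be nitpicky, but I'd like to point out that 'bear(\s|$)' doesn't match if the word "bear" is followed by any punctuation, as in "bear." or "bear, who" and so on. I'd suggest using a word boundary '\b' instead (though admittedly that isn't a perfect solution either; for example, it would match "bear-sized"). –

@Rawing Nitpicky is good! Fixed. – alexwlchan

Lookbehind and lookahead assertions are what you are looking for.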

re.sub(".+(?=bear)", "the ", "perfect bear swims")
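To see why a lookahead helps here, it may be useful to compare what each pattern actually matches. The sketch below uses the question's `\w+` rather than this answer's `.+`, and the variable names are only illustrative:

```python
import re

text = "perfect bear swims"

# Without a lookahead, "bear" is part of the match and therefore gets replaced.
consuming = re.search(r"\w+ bear", text)

# With a lookahead, "bear" must follow but is not part of the match.
lookahead = re.search(r"\w+ (?=bear)", text)

print(repr(consuming.group()))  # 'perfect bear'
print(repr(lookahead.group()))  # 'perfect '
```

Because the lookahead is zero-width, re.sub only replaces the text before "bear".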

Source

2017-10-05 14:55:07 hspandher

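This will replace everything before the characters "bear". Try it with "my long beard". –

This will produce 'thebear swims' – Igle

Use a positive lookahead to replace everything before "bear":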

re.sub(".+(?=bear)","the ","perfect bear swims")

.+ will match any characters (except line terminators).

Source

2017-10-05 14:56:30 Igle

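This will literally replace everything before the characters "bear", not just the preceding word. Try it with "my long beard" to see the problem... –

Updated with a space. Thanks for the hint ;) – Igle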

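It still turns "a big bear" into "the bear" instead of "a the bear". The OP said they want to replace the _word_ before "bear", not the whole string. You went and completely changed the OP's '\w+' for absolutely no reason. –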
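The objection in these comments can be checked directly. The sketch below compares the greedy `.+` pattern from this answer with a word-based pattern; the `\b` in the second pattern and the sample strings are assumptions made for illustration:

```python
import re

greedy = r".+(?=bear)"          # pattern from this answer
word_only = r"\w+ (?=bear\b)"   # word-based pattern with a word boundary (assumed)

samples = ["a big bear swims", "my long beard"]

# Map each sample to (result with greedy pattern, result with word-based pattern).
results = {
    s: (re.sub(greedy, "the ", s), re.sub(word_only, "the ", s))
    for s in samples
}

for s, (g, w) in results.items():
    print(repr(s), "greedy:", repr(g), "word-only:", repr(w))
```

The greedy pattern swallows everything up to the last "bear" (including the one inside "beard"), while the word-based pattern only touches the single word before a standalone "bear".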

An alternative to using a lookahead:

Capture the part you want to keep with a group () and reinsert it in the replacement using \1.

re.sub("\w+ (bear)",r"the \1","perfect bear swims")

Source

2017-10-05 14:57:55 Felk

Note that this will also match words like "beard". You should consider adding a word boundary '\b'. –
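A minimal sketch of the capture-group approach with the word boundary suggested in the comment, assuming a hypothetical helper named `replace_word_before_bear`:

```python
import re


def replace_word_before_bear(text):
    # Hypothetical helper. The group captures "bear" so that \1 can put it back;
    # the trailing \b keeps longer words such as "beard" from matching.
    return re.sub(r"\w+ (bear)\b", r"the \1", text)


print(replace_word_before_bear("perfect bear swims"))
print(replace_word_before_bear("a perfect beard"))
```

Unlike the lookahead version, this pattern consumes "bear" and writes it back through the backreference, so the result is the same for these inputs but the match itself is wider.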
