Replace one word with another

some text and some text too bad, 
some too&nbsp; bad again some bad 
and other words bad, it is too  bad

I am trying to replace every occurrence of the word "bad" with "good", with one exception:

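If the word "too" comes before "bad", then "bad" should not be changed to "good". "too" and "bad" may be separated by one or more whitespace characters, or even by an HTML space (&nbsp;).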

So after the regular expression has processed it, the text should be

some text and some text too bad, 
some too&nbsp; bad again some good 
and other words good, it is too  bad

I tried something like this, but it does not work properly.

$text ~= s/(too(\s+|\s*&nbsp;\s*))bad/good/ig;

Please help.

Source

2013-10-25 Kirill Reva

Although regex experts can work miracles, in the end someone has to understand and maintain such code.


You can try decoding the HTML space and then applying a regular expression that checks whether the preceding word is too:

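#!/usr/bin/env perl

use strict;
use warnings;
use HTML::Entities;

while (<DATA>) {
    # Decode only &nbsp;, turning it into a non-breaking space character (\xA0).
    _decode_entities($_, { nbsp => "\xA0" });
    # Keep the match as-is when the preceding word is "too"; otherwise use "good".
    s/(\w+)(\s+)bad/$1 eq 'too' ? $& : "$1$2good"/eg;
    # Re-encode so the non-breaking space is written back as &nbsp;.
    encode_entities($_);
    print $_;
}

__DATA__
some text and some text too bad,
some too&nbsp; bad again some bad
and other words bad, it is too  bad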

Run it like:

perl script.pl

That yields:

some text and some text too bad, 
some too&nbsp; bad again some good 
and other words good, it is too  bad
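
As an alternative sketch that is not part of the answer above, the same rule can probably be expressed in a single substitution by using Perl's backtracking control verbs (available since Perl 5.10). The helper name replace_bad is hypothetical, and the pattern assumes that only whitespace and the literal text &nbsp; may separate "too" from "bad":

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the answer above): replace "bad"
# with "good" unless the word "too" precedes it, separated by whitespace
# and/or literal &nbsp; entities.
sub replace_bad {
    my ($text) = @_;
    # First alternative: consume a protected "too ... bad" phrase, then
    # (*SKIP)(*FAIL) rejects it and stops the engine retrying inside that span.
    # Second alternative: any remaining standalone "bad" is replaced.
    $text =~ s/\btoo(?:\s|&nbsp;)+bad\b(*SKIP)(*FAIL)|\bbad\b/good/gi;
    return $text;
}

print replace_bad($_), "\n" for (
    'some text and some text too bad,',
    'some too&nbsp; bad again some bad',
    'and other words bad, it is too  bad',
);
```

This version matches case-insensitively, like the attempt in the question, but always inserts a lowercase "good". It also avoids the decode/encode round trip, so other entities in the text are left untouched.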

Source

2013-10-25 12:21:52 Birei

So a non-breaking space becomes a breakable one? – Borodin

@Borodin: Thanks for noticing this bug. I have added the encode_entities() function to fix it. – Birei

Thanks, Borodin and @Birei, it really helped me a lot.

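I don't believe this can be done conveniently with a regular expression. It is complicated further because the idea of a word is unclear: for example, you want "bad," (with a trailing comma) to be treated as the word "bad".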

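This program works by tokenizing the string into words and separators, and then changing every occurrence of "bad" to "good" unless it is preceded by "too" (ignoring upper and lower case). I have included commas, colons and semicolons in the list of possible separators. You may want to adjust this to get the results you expect.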

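use strict;
use warnings;

my $text = <<END;
some text and some text too bad,
some too&nbsp; bad again some bad
and other words bad, it is too  bad
END

# Split into alternating words and separators; the capture group keeps the
# separators in the list, so words sit at the even indices.
my @tokens = split /((?:[\s,;.:]|&nbsp;)+)/, $text;

# Replace each "bad" unless the previous word (two tokens back) is "too".
for my $i (grep { lc $tokens[$_] eq 'bad' } 0 .. $#tokens) {
    # The $i >= 2 guard stops a leading "bad" from wrapping round to the end of the list.
    $tokens[$i] = 'good' unless $i >= 2 && lc $tokens[$i-2] eq 'too';
}

print join '', @tokens;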

Output

some text and some text too bad, 
some too&nbsp; bad again some good 
and other words good, it is too  bad
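
Since the separator list may need adjusting, here is a minimal sketch of the same tokenizing idea wrapped in a reusable sub that takes the separator pattern as an optional argument. The sub name replace_bad and its second parameter are assumptions made for illustration, not part of the program above, and the `//=` operator needs Perl 5.10 or later:

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): tokenize $text with a
# caller-supplied separator pattern and replace "bad" unless "too" precedes it.
sub replace_bad {
    my ($text, $sep) = @_;
    # Default separators mirror the list used in the program above.
    $sep //= qr/(?:[\s,;.:]|&nbsp;)+/;

    # The capture group keeps separators, so words sit at even indices.
    my @tokens = split /($sep)/, $text;

    for my $i (0 .. $#tokens) {
        next unless lc $tokens[$i] eq 'bad';
        next if $i >= 2 && lc $tokens[$i - 2] eq 'too';
        $tokens[$i] = 'good';
    }
    return join '', @tokens;
}

# With the default separators, "bad!" is a single token and is left unchanged.
print replace_bad('it is too! bad and bad!'), "\n";

# Adding '!' and '?' to the separators lets the trailing "bad" be recognised.
print replace_bad('it is too! bad and bad!', qr/(?:[\s,;.:!?]|&nbsp;)+/), "\n";
```

Passing the pattern in keeps the definition of a "word" in one place, which is the part most likely to change as new punctuation turns up in the input.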

Source

2013-10-25 12:16:57 Borodin
