How to match words outside of HTML tags with a regular expression

Possible duplicate:
RegEx match open tags except XHTML self-contained tags

How can I match a word only where it appears outside HTML tags, rather than matching every occurrence of the word?

For example:

<div id="mariano mariano mariano" nota="mariano/mariano">mariano was looking forward Mariano. I want to match this "Mariano" too. Mariano</div>

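In this example, I want to match every "Mariano" that is outside the tag (that is, not the ones inside the id and nota attributes).

I think the key is to look ahead from the word: if a "<" comes before the next ">", the word is outside a tag and should match; if a ">" comes before the next "<", the word is inside a tag. However, I have not managed to come up with a regular expression that does this.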
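The look-ahead idea described above can be sketched with a negative lookahead, which JavaScript supports. This is only a minimal sketch: the helper names `escapeRegExp` and `outsideTagsRegExp` are made up for illustration, and the pattern assumes that attribute values never contain "<" and that text never contains a bare ">", which real HTML does not guarantee.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a regex that matches `term` only when no ">" follows it before the
// next "<" (or before the end of the string), i.e. when it is not inside a tag.
// Hypothetical helper, not part of any library.
function outsideTagsRegExp(term) {
  return new RegExp(escapeRegExp(term) + "(?![^<>]*>)", "gi");
}

var html = '<div id="mariano mariano mariano" nota="mariano/mariano">' +
  'mariano was looking forward Mariano. I want to match this "Mariano" too. Mariano</div>';

// Only the occurrences in the text content are found or replaced.
var matches = html.match(outsideTagsRegExp("mariano"));
var replaced = html.replace(outsideTagsRegExp("mariano"), "Pedro");
```

On the example string this leaves the id and nota attributes alone and touches only the text between the tags. An attribute such as title="mariano < 3" would still produce a false match, which is one reason a DOM-based approach is usually safer.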

I tried to combine this regular expression, (?<=^|>)[^><]+?(?=<|$), with another one, without success. My final, lowest-quality solution is:

<!-- language: lang-js --> 
var searchFor = new RegExp("((!?<=^|>)" + termino + ")","ig"); 
var searchFor2 = new RegExp("(" + termino + "(?=<|$))","ig"); 
var searchFor3 = new RegExp("(!?<=^|[\\s\\.;,])" + termino + "(?=[\\s\\.;,]|$)","ig");

But those three do not cover all the cases.

Edit: I'm using JavaScript:

<script>
container.find("p, span, div, .texto").each(function() {
    var containerText = $(this).html();
    for (var i = 0; i < terms.length; i++) {
        var termino = terms[i];
        // 1st issue: ">termino" was replaced with ">Pedro"
        var searchFor = new RegExp("((!?<=^|>)" + termino + ")","ig");
        containerText = containerText.replace(searchFor,">Pedroedro");
        // 2nd issue: "termino<" was replaced with "Pedro"
        var searchFor2 = new RegExp("(" + termino + "(?=<|$))","ig");
        containerText = containerText.replace(searchFor2,"Pedro");
        // 3rd issue: "[\.\s,;:]termino[\.\s,;:]"
        var searchFor3 = new RegExp("(!?<=^|[\\s\\.;,])" + termino + "(?=[\\s\\.;,]|$)","ig");
        containerText = containerText.replace(searchFor3," Pedro");
    }
    $(this).html(containerText);
});
</script>

Source

2012-09-19 Mariano Ignagni

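Please don't try to parse HTML with regular expressions: http://stackoverflow.com/a/1732454/451590

Give some examples of the markup and of the strings you are looking for. Also, all the text in a document is at least inside the 'body' element.

Regular expressions are not the way to parse HTML. Take a look at http://htmlparsing.com for some starting points.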

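A few things:

Welcome to Stack Overflow!
Please search for existing questions before asking. There are many results for parsing XML with regular expressions.

Don't use regular expressions to parse XML/HTML! Try XPath!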

var termino = "Mariano"; // or however you were defining it before

// Select every text node that is a direct child of a div and contains the value of "termino".
// contains() is case-sensitive, and termino must not itself contain a double quote.
var iterator = document.evaluate('//div/text()[contains(., "' + termino + '")]', document, null, XPathResult.UNORDERED_NODE_ITERATOR_TYPE, null);

try {
    // init thisNode to the first item in the iterator
    var thisNode = iterator.iterateNext();

    // go through all items, alert their content (which should contain termino)
    while (thisNode) {
        alert(thisNode.textContent);
        thisNode = iterator.iterateNext();
    }
}
catch (e) {
    // iterateNext() throws if the document is modified during iteration
    console.error('Error: Document tree modified during iteration ' + e);
}
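The XPath query above only locates the text nodes. To actually replace the term, as the question does, one option is to walk the DOM and edit text nodes only. The following is a rough sketch rather than a definitive implementation: `replaceInTextNodes` is a hypothetical helper name, and it relies only on the standard `nodeType`, `nodeValue` and `childNodes` properties of DOM nodes.

```javascript
// Replace `pattern` in every text node under `node`, leaving element
// names and attribute values untouched. Hypothetical helper for illustration.
function replaceInTextNodes(node, pattern, replacement) {
  var TEXT_NODE = 3; // same value as Node.TEXT_NODE in browsers
  if (node.nodeType === TEXT_NODE) {
    node.nodeValue = node.nodeValue.replace(pattern, replacement);
    return;
  }
  // Recurse into child nodes; nodes without children are skipped.
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    replaceInTextNodes(children[i], pattern, replacement);
  }
}
```

In a browser this could be called as `replaceInTextNodes(document.body, /mariano/gi, "Pedro")`. Note that it would also descend into script and style elements, so a real page may need to skip those.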

Source

2012-09-19 22:00:47 Dave
