Find URLs while ignoring HTML tags

I have the following text/html:

Hello ! You should check this link : http://google.com 
And this link too : <a href="http://example.com">http://example2.com</a>

I'd like a regular expression that finds the URLs in my text so that I can replace them with <a> tags. I came up with the following regex:

var REG_EXP = /[^">]((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+@)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%@.\w_]*)#?(?:[\w]*))?)[^"<]/gi;

But my regex also matches http://example.com and http://example2.com, which are already inside a link. I don't know how to improve it to avoid that.

2013-06-25 Magus

I don't understand. You only want to wrap the google URL in <a>, but not example2, right? – sp00m

Possible duplicate? http://stackoverflow.com/questions/287144/need-a-good-regex-to-convert-urls-to-links-but-leave-existing-links-alone

Maybe this isn't a good use case for a regex: see http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html – Paul

Check this answer: https://stackoverflow.com/a/4217452/1795220. Surely HTML like <a href="http://example.com">http://example2.com</a> is incorrect anyway.

2013-06-25 07:57:02 igo

This might suit your needs:

(?<!href=")(http://[a-z0-9]++(?:[.-:/?&=][a-z0-9]+)++)(?!</a>)

Note that the URL pattern I used is very simple and permissive:

http://[a-z0-9]+(?:[.-:/?&=][a-z0-9]+)+

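(?<!href=") means "not preceded by href=""
(?!</a>) means "not followed by </a>"
++ is called a possessive quantifier: once it has matched, it never gives characters back when the engine backtracks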
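
To see why the possessive quantifiers matter, here is a minimal sketch in JavaScript, the language of the question. Two assumptions are mine, not the answer's: the engine supports lookbehind (ES2018 or later), and, since JavaScript regular expressions have no `++`, possessiveness is emulated with a lookahead that captures the URL followed by a backreference. The names `sample`, `naive` and `atomic` are purely illustrative.

```javascript
// Text that is already the label of a link and should therefore NOT match.
const sample = '<a href="http://example.com">http://example2.com</a>';

// Naive port: every `++` replaced by a plain `+`, so the engine may backtrack.
// Note: inside the class, `.-:` is a range covering `.`, `/`, digits and `:`.
const naive = /(?<!href=")(http:\/\/[a-z0-9]+(?:[.-:\/?&=][a-z0-9]+)+)(?!<\/a>)/gi;

// Emulated possessive match: the lookahead captures the longest URL, and the
// backreference \1 consumes exactly that text. The engine cannot backtrack
// into a lookahead, so no character can be given back afterwards.
const atomic = /(?<!href=")(?=(http:\/\/[a-z0-9]+(?:[.-:\/?&=][a-z0-9]+)+))\1(?!<\/a>)/gi;

// With the naive pattern, the engine drops the final "m" so that the text
// following the match is "m</a>" rather than "</a>", and the lookahead passes.
const naiveMatches = sample.match(naive);   // ["http://example2.co"]

// With the emulated possessive pattern, the whole URL is rejected.
const atomicMatches = sample.match(atomic); // null

console.log(naiveMatches, atomicMatches);
```

The naive version would wrap a truncated URL inside the existing link, which is exactly what `++` prevents in engines that support it.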

Then simply replace each match with <a href="$1">$1</a>.
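
The replacement step could look roughly like this in JavaScript. This is a sketch under assumptions rather than the answer's own code: `linkify` is a hypothetical helper name, lookbehind requires an ES2018+ engine, and the lookahead-plus-backreference form stands in for `++`, which JavaScript does not have.

```javascript
// Hypothetical helper: wraps bare http:// URLs in <a> tags while leaving
// URLs that are already an href value or a link label untouched.
function linkify(text) {
  // (?<!href=")   not preceded by href="
  // (?=(...))\1   capture the whole URL once, then consume exactly that text
  //               (emulates the possessive quantifiers of the original)
  // (?!<\/a>)     not followed by </a>
  const urlOutsideAnchor =
    /(?<!href=")(?=(http:\/\/[a-z0-9]+(?:[.-:\/?&=][a-z0-9]+)+))\1(?!<\/a>)/gi;
  // $1 refers to the captured URL.
  return text.replace(urlOutsideAnchor, '<a href="$1">$1</a>');
}

// The sample text from the question.
const input =
  'Hello ! You should check this link : http://google.com \n' +
  'And this link too : <a href="http://example.com">http://example2.com</a>';

const output = linkify(input);
console.log(output);
```

On the question's sample, only http://google.com gets wrapped; the second line comes back unchanged.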

Don't expect too much from regular expressions for this kind of task; it's not what they were made for.

2013-06-25 08:11:57 sp00m
