How to get only the URLs from a string that contains HTML tags

I have this code, which detects URLs in a string containing HTML content:

$regex = "/(http|https)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";
preg_match_all($regex, $desc, $url);
print_r($url);

The code works great, but print_r($url) returns URLs such as:

http://url.com/</p>

The </p> is just the closing tag of the <p> element that contains the URL, and I don't want it in my URL.

How can I prevent this?

Thanks, Peter

Source

2013-10-20 Peter Stuart

Parse your HTML, then run the link finder on the text? – Ryan

How would I parse the HTML? –

http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php – Ryan
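
Ryan's suggestion (parse the HTML first, then search only the text) could look roughly like the sketch below. It is an illustration, not code from the thread: the function name extract_urls_from_html is hypothetical, and it assumes PHP's bundled DOM extension is available. It reuses the regex from the question unchanged and applies it to each text node, so markup such as </p> never reaches the pattern.

```php
<?php
// Hypothetical helper (not from the thread): collect URLs found in the
// text nodes of an HTML fragment, ignoring the markup itself.
function extract_urls_from_html(string $html): array
{
    if (trim($html) === '') {
        return []; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Same pattern as in the question.
    $regex = '/(http|https)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/';

    $urls = [];
    $xpath = new DOMXPath($doc);
    // Visit every text node; tags are never part of a text node's value.
    foreach ($xpath->query('//text()') as $node) {
        if (preg_match_all($regex, $node->nodeValue, $matches)) {
            $urls = array_merge($urls, $matches[0]);
        }
    }
    return $urls;
}

print_r(extract_urls_from_html(
    '<p>See http://url.com/</p><p>and https://example.org/a?b=1</p>'
));
```

Note that //text() also visits text inside script and style elements, and that URLs stored in attributes such as href are not text nodes, so this sketch does not return them.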

I used the strip_tags function to remove all the HTML tags, and then preg_match_all() to get each URL:

$regex = "/(http|https)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/"; 
preg_match_all($regex, strip_tags(html_entity_decode($desc)), $url); 
print_r($url);
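
One caveat with this approach is that strip_tags() removes tags without inserting whitespace, so a URL at the end of one element can be glued to the text of the next (for example, <p>http://url.com/</p><p>next</p> becomes http://url.com/next). A possible workaround, shown here only as a sketch with a hypothetical function name, is to put a space in front of every tag before stripping:

```php
<?php
// Hypothetical wrapper around the strip_tags() approach; the name and the
// spacing step are illustrative additions, not part of the original answer.
function extract_urls_strip_tags(string $html): array
{
    // Same pattern as in the answer.
    $regex = '/(http|https)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/';

    // Same order as the answer: decode entities first, then strip tags.
    $decoded = html_entity_decode($html);
    // Insert a space before each "<" so text from adjacent elements
    // stays separated once the tags are gone.
    $spaced = str_replace('<', ' <', $decoded);
    $text = strip_tags($spaced);

    preg_match_all($regex, $text, $matches);
    return $matches[0]; // index 0 holds the full matches
}

print_r(extract_urls_strip_tags('<p>http://url.com/</p><p>next</p>'));
```

With preg_match_all() in its default mode, the result array is grouped by capture group: index 0 holds the full URLs, index 1 the scheme (http or https), and index 2 the path. That is why the sketch returns only $matches[0].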

I hope this helps others in the future!

Peter

Source

2013-10-20 22:36:55
