I want to extract multiple URLs from an HTML file using a regular expression. Can anyone help me with my regex problem? The HTML looks like this:
<h1 class="article"><a href="http://www.domain1.com/page-to-article1" onmousedown="return(...)
<h1 class="article"><a href="http://www.domain2.com/page-to-article2" onmousedown="return(...)
<h1 class="article"><a href="http://www.domain3.com/page-to-article3" onmousedown="return(...)
<h1 class="article"><a href="http://www.domain3.com/page-to-article4" onmousedown="return(...)
I only want to extract the URLs that appear between <h1 class="article"><a href="
and " onmousedown="return(...)
for example http://www.domain1.com/page-to-article1,
http://www.domain2.com/page-to-article2,
http://www.domain3.com/page-to-article3,
and so on.
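A minimal sketch of the regex approach described above: the pattern anchors on the literal `<h1 class="article"><a href="` prefix and captures everything up to the closing double quote. The thread does not name a language, so Python is assumed here. The function name `extract_article_urls` and the sample markup (including the shortened `onmousedown` value) are hypothetical. The pattern only works while the markup keeps exactly this attribute order and spacing.

```python
import re

# Literal prefix from the question, followed by a capture group that takes
# every character up to (but not including) the next double quote.
ARTICLE_LINK = re.compile(r'<h1 class="article"><a href="([^"]+)"')


def extract_article_urls(html):
    """Return the href values of links that directly follow <h1 class="article">."""
    return ARTICLE_LINK.findall(html)


# Hypothetical sample modelled on the question's markup.
sample = '''
<h1 class="article"><a href="http://www.domain1.com/page-to-article1" onmousedown="return(0)">A</a></h1>
<h1 class="article"><a href="http://www.domain2.com/page-to-article2" onmousedown="return(0)">B</a></h1>
<h2 class="other"><a href="http://www.example.com/ignored">C</a></h2>
'''

if __name__ == "__main__":
    for url in extract_article_urls(sample):
        print(url)
```

Because `[^"]+` stops at the first double quote, the `onmousedown` attribute never needs to appear in the pattern.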
[ask] is a good guide. The answer to your question is here: http://stackoverflow.com/a/1732454 – brasofilo
`DOMDocument` with `DOMXPath`, query for `//h1[@class='article']/a/@href` – Wrikken
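The comment above points to PHP's DOMDocument and DOMXPath. As a rough illustration of the same idea (use a parser rather than a regex, and collect `href` from `a` elements inside `h1` elements whose class is `article`), here is a sketch with Python's standard-library `html.parser`. It is a stand-in and not the DOMXPath query itself: it accepts any `a` nested inside the matching `h1`, not only direct children. The names `ArticleLinkParser` and `extract_urls_with_parser` and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class ArticleLinkParser(HTMLParser):
    """Collects href values of <a> tags found inside <h1 class="article">."""

    def __init__(self):
        super().__init__()
        self.in_article_h1 = False  # True while inside a matching <h1>
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h1":
            # Exact match on the class attribute, like the XPath predicate.
            self.in_article_h1 = attrs.get("class") == "article"
        elif tag == "a" and self.in_article_h1 and attrs.get("href"):
            self.urls.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_article_h1 = False


def extract_urls_with_parser(html):
    """Parse the markup and return the collected article URLs."""
    parser = ArticleLinkParser()
    parser.feed(html)
    parser.close()
    return parser.urls


# Hypothetical sample; attribute order and spacing vary on purpose.
sample_html = (
    '<h1 class="article"><a href="http://www.domain1.com/page-to-article1" '
    'onmousedown="return(0)">One</a></h1>\n'
    '<h1 class="article"><a  onmousedown="return(0)" '
    'href="http://www.domain2.com/page-to-article2">Two</a></h1>\n'
    '<h1 class="other"><a href="http://www.example.com/ignored">Skip</a></h1>\n'
    '<p><a href="http://www.example.com/also-ignored">Skip</a></p>\n'
)

if __name__ == "__main__":
    for url in extract_urls_with_parser(sample_html):
        print(url)
```

Unlike a fixed regex, this keeps working when attributes are reordered or extra whitespace appears, which is the usual argument for parsing HTML instead of pattern-matching it.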