2015-11-21 27 views
0

Hello, this is my code. I am running a regular expression over a file and want to get the value that comes after a closing tag.

<?php 
require('/simple_html_dom.php'); 
$html = new simple_html_dom(); 
$html = file_get_html('proxys.html'); 

$items = array(); 
$re = "/<td class=\\\"t_ip\\\">\\s*((?:[0-9]{1,3}\\.){3}[0-9]{1,3})\\s*<\\/td>(?:.*?)*<td class=\"t_port\">(?:.*?)\\w+\\^\\w+\\^([0-9]{1,5})(?:.*?)<td class=\"t_type\">\\s*([0-9])(?:.*?)/"; 

     preg_match_all($re, $html, $matches, PREG_SET_ORDER); 
     foreach ($matches as $val) { 
     echo nl2br($val[1] . ':' . $val[2] . ' ' . $val[3] . "\n"); 
     }; 

?> 

proxys.html

<td class="t_ip">104.131.248.140</td><td class="t_port">   <script type="text/javascript">   //<![CDATA[    document.write(BigBlind^BigBlind^60088);   //]]>   </script>50088   </td><td class="t_type">  5   </td><td class="t_ip">79.101.32.14</td><td class="t_port">   <script type="text/javascript">   //<![CDATA[    document.write(Polymorth^Polymorth^1080);   //]]>   </script>45080   </td> 

The problem is that it picks up the value "60088" from document.write(BigBlind^BigBlind^60088), so I get:

104.131.248.140: 60088 5 
79.101.32.14:  1080 4 

But I want to get the value that comes after </script>:

104.131.248.140: 50088 5 
79.101.32.14:  45080 4 

I am lost with the regular expression. Thank you for your help.

+1

Regular expressions are not the perfect tool for parsing HTML/XML. –

+0

Thanks, any ideas? –

+2

Possible duplicate of [RegEx match open tags except XHTML self-contained tags](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – joce

Answers

1

You can try using DOMDocument, for example:

$html = '<td class="t_ip">104.131.248.140</td><td class="t_port">   <script type="text/javascript">   //<![CDATA[    document.write(BigBlind^BigBlind^60088);   //]]>   </script>50088   </td><td class="t_type">  5   </td><td class="t_ip">79.101.32.14</td><td class="t_port">   <script type="text/javascript">   //<![CDATA[    document.write(Polymorth^Polymorth^1080);   //]]>   </script>45080   </td>'; 

$dom = new DOMDocument; 
$dom->loadHTML($html); 
$root = $dom->documentElement; 
$tds = $root->getElementsByTagName("td"); 
foreach($tds as $key => $value){ 
    echo $value->parentNode->textContent."<br>"; 
}
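
Note that textContent on a cell (or its parent) also includes the text inside the script element, so the document.write(...) number still shows up. Below is a minimal sketch of one way to keep only the number after </script>: read just the text nodes that are direct children of each td. The helper names directText and extractProxies are hypothetical, and wrapping the fragment in <table><tr> is my own assumption so that the bare cells have a valid parent when parsed.

```php
<?php
// Sample markup copied from the question.
$html = '<td class="t_ip">104.131.248.140</td><td class="t_port">   <script type="text/javascript">   //<![CDATA[    document.write(BigBlind^BigBlind^60088);   //]]>   </script>50088   </td><td class="t_type">  5   </td>'
      . '<td class="t_ip">79.101.32.14</td><td class="t_port">   <script type="text/javascript">   //<![CDATA[    document.write(Polymorth^Polymorth^1080);   //]]>   </script>45080   </td>';

/**
 * Hypothetical helper: concatenate only the text nodes that are direct
 * children of $node, so text nested inside <script> is skipped.
 */
function directText(DOMNode $node): string
{
    $text = '';
    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_TEXT_NODE) {
            $text .= $child->nodeValue;
        }
    }
    return trim($text);
}

/**
 * Hypothetical helper: walk the cells in document order and start a new
 * record at every t_ip cell; t_port and t_type fill in the current record.
 */
function extractProxies(string $html): array
{
    $previous = libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    $dom = new DOMDocument();
    // Assumption: wrap the fragment so the <td> cells sit inside a table row.
    $dom->loadHTML('<table><tr>' . $html . '</tr></table>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $rows = [];
    $current = null;
    foreach ($dom->getElementsByTagName('td') as $td) {
        $value = directText($td);
        switch ($td->getAttribute('class')) {
            case 't_ip':
                if ($current !== null) {
                    $rows[] = $current;
                }
                $current = ['ip' => $value, 'port' => '', 'type' => ''];
                break;
            case 't_port':
                if ($current !== null) {
                    $current['port'] = $value;
                }
                break;
            case 't_type':
                if ($current !== null) {
                    $current['type'] = $value;
                }
                break;
        }
    }
    if ($current !== null) {
        $rows[] = $current;
    }
    return $rows;
}

foreach (extractProxies($html) as $row) {
    echo $row['ip'] . ':' . $row['port'] . ' ' . $row['type'] . "\n";
}
```

In the sample above the second entry has no t_type cell, so its type is left as an empty string rather than guessed. With the real file you could pass file_get_contents('proxys.html') to extractProxies instead of the inline string.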