2017-08-06 94 views
0

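I just found out that '$NotXP->query' does not work: does query return a string?! How do I get an object back from PHP DOMXPath?

How can I make the following code work?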

$xp = new \DOMXPath(@\DOMDocument::loadHTMLFile($url)); 

     $list = $xp->query('//table[@class="table-list quality series"] tbody'); 
     $link = $list->query('//tr[@class="item"]'); 

     $arr_links = []; 

     foreach ($link as $link_in_cycle) { 
      $link_quality = $link_in_cycle->query('//td[@class="column first video"]'); 
      $link_audio = $link_in_cycle->query('//td[@class="column audio"]'); 
      $link_size = $link_in_cycle->query('//td[@class="column size"]'); 
      $link_seed = $link_in_cycle->query('//td[@class="column seed-leech"] span[@class="seed"]'); 
      $link_download_url = $link_in_cycle->query('//td[@class="column last download"] a')->getAttribute("data-default"); 

At @NigelRen's request:

This is the HTML source from which the information needs to be scraped:

<tbody> 
             <tr class="item"> 
       <td class="column first video">720x400</td> 
       <td class="column audio">mp3</td> 
       <td class="column size">5.70 Gb</td> 
       <td class="column seed-leech"> 
        <span class="seed">15</span> 
        <span class="leech">26</span> 
       </td> 
       <td class="column updated">07.07.2017</td> 
       <td class="column consistence"><a href="javascript:void(0);" title="title in td" data-type="torrent-consistence" class="show-modal show-consistence" data-route="/hashinfo/12345?fields=files"></a></td> 
       <td class="column last download"> 
       <a class="button middle rounded download zona-link" 
    data-type="download" 
    data-zona="0" 
    data-torrent="" 
    data-default="url_data" 
    data-not-installed="" 
    data-installed="" 
    data-metriks="{'eventType': 'click', 'data' : { 'type': 'show_download', 'id': '84358'}}" 
    title="text in title" href="javascript:void(0);" >Download</a>    </td> 

Can you say what you are trying to achieve, and possibly give a sample URL that people can use to test any possible solutions? –


@NigelRen, I've added the HTML source code. – ALPHA

Answers

2

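I made a few changes to help me debug the code. The main problem is that your XPath expressions are invalid: a space between steps is not a descendant selector in XPath, so use `/` or `//` instead. You can always try a site like FreeFormatter, which lets you check your expressions against some sample source.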

// Load the page into a DOM tree and wrap it in an XPath evaluator.
$doc = new \DOMDocument(); 
$doc->loadHTMLFile($url); 
$xp = new \DOMXPath($doc); 

// Select every item row of the target table with a single expression.
$list = $xp->query('//table[@class="table-list quality series"]//tr[@class="item"]'); 
$arr_links = []; 

foreach ($list as $link_in_cycle) { 

    // query() belongs to DOMXPath, so the row is passed as the context node.
    // The leading "." keeps each expression relative to that row; a bare "//"
    // searches the whole document even when a context node is given.
    $link_quality = $xp->query('.//td[@class="column first video"]/text()', $link_in_cycle)[0]->wholeText; 
    $link_audio = $xp->query('.//td[@class="column audio"]/text()', $link_in_cycle)[0]->wholeText; 
    $link_size = $xp->query('.//td[@class="column size"]/text()', $link_in_cycle)[0]->wholeText; 
    $link_seed = $xp->query('.//td[@class="column seed-leech"]//span[@class="seed"]/text()', $link_in_cycle)[0]->wholeText; 
    // Selecting the attribute node directly; its ->value is the attribute text.
    $link_download_url = $xp->query('.//td[@class="column last download"]//a/@data-default', $link_in_cycle)[0]->value; 

    echo $link_quality.PHP_EOL; 
    echo $link_audio.PHP_EOL; 
    echo $link_size.PHP_EOL; 
    echo $link_seed.PHP_EOL; 
    echo $link_download_url.PHP_EOL; 
} 

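Each XPath expression retrieves the text node of an element. query() always returns a list of all matching nodes, so the code uses [0] to take the first element of that list, and it assumes there is no whitespace around the actual content. wholeText is simply the actual content of the DOMText node.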

With the sample content you gave (plus the surrounding bits I had to invent), it gives...

720x400 
mp3 
5.70 Gb 
15 
url_data 
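
The answer notes that its code assumes no whitespace around the cell content. A minimal sketch of one way to relax that assumption is to let XPath return strings directly through DOMXPath::evaluate() with normalize-space(). The two-row sample markup, the padded whitespace, the second URL, and the array keys below are all invented for illustration and are not taken from the real page.

```php
<?php
// Hypothetical two-row sample modelled on the markup in the question;
// the second row and the extra whitespace are invented for illustration.
$html = <<<'HTML'
<table class="table-list quality series"><tbody>
<tr class="item">
  <td class="column first video"> 720x400 </td>
  <td class="column size">
      5.70 Gb
  </td>
  <td class="column last download"><a data-default="url_data">Download</a></td>
</tr>
<tr class="item">
  <td class="column first video">1280x720</td>
  <td class="column size">9.10 Gb</td>
  <td class="column last download"><a data-default="url_data_2">Download</a></td>
</tr>
</tbody></table>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xp = new DOMXPath($doc);

$rows = [];
foreach ($xp->query('//table[@class="table-list quality series"]//tr[@class="item"]') as $row) {
    $rows[] = [
        // evaluate() returns a plain PHP string for string-valued expressions;
        // normalize-space() trims and collapses whitespace around the content.
        'quality' => $xp->evaluate('normalize-space(.//td[@class="column first video"])', $row),
        'size'    => $xp->evaluate('normalize-space(.//td[@class="column size"])', $row),
        // string() of an attribute node gives its value, or "" when it is absent.
        'url'     => $xp->evaluate('string(.//td[@class="column last download"]//a/@data-default)', $row),
    ];
}

print_r($rows);
```

Because each expression starts with ".", it is evaluated relative to the current row, so the second row yields its own values rather than repeating the first row's. A missing cell produces an empty string here instead of an error on a null node, which may or may not be what you want.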

$link_download_url needs the attribute 'data-default' – ALPHA


I've changed the XPath to fetch the attribute. –
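
As an alternative to selecting the attribute node in the XPath itself, the getAttribute() call from the question can be kept, provided it is made on a single DOMElement rather than on the result of query(). The sketch below assumes a cut-down version of the download cell from the question; the markup string and variable names are illustrative only.

```php
<?php
// Hypothetical minimal markup based on the download cell in the question.
$html = '<table><tr class="item"><td class="column last download">'
      . '<a data-default="url_data" href="#">Download</a></td></tr></table>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xp = new DOMXPath($doc);

// query() yields a DOMNodeList (or false for a malformed expression),
// never a single element or a string.
$links = $xp->query('//td[@class="column last download"]//a');

$url = null;
if ($links !== false && $links->length > 0) {
    $a = $links->item(0);
    // getAttribute() is defined on DOMElement, so pick one node first.
    if ($a instanceof DOMElement) {
        $url = $a->getAttribute('data-default');
    }
}

echo $url, PHP_EOL;
```

This is why the original `$list->query(...)` and `->query(...)->getAttribute(...)` chains fail: the list returned by query() has neither method, so the DOMXPath object has to run every query and one node has to be picked before reading its attributes.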