2015-11-12

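When I scrape the tables, the tr and td rows vary from page to page. The original tables are shown below. How can I find specific data using PHP Simple HTML DOM?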

<table class="scoretable"> 
<tbody> 
<tr><td class="jdhead">Name</td><td class="fullhead">John</td></tr> 
<tr><td class="jdhead">Age</td><td class="fullhead">30</td></tr> 
<tr><td class="jdhead">Phone</td><td class="fullhead">91234988788</td></tr> 
<tr><td class="jdhead">Location</td><td class="fullhead">Madrid</td></tr> 
<tr><td class="jdhead">Country</td><td class="fullhead">Spain</td></tr> 
<tr><td class="jdhead">Role</td><td class="fullhead">Manager</td></tr> 
</tbody> 
</table> 

<table class="scoretable"> 
<tbody> 
<tr><td class="jdhead">Name</td><td class="fullhead">John</td></tr> 
<tr><td class="jdhead">Age</td><td class="fullhead">30</td></tr> 
<tr><td class="jdhead">Phone</td><td class="fullhead">91234988788</td></tr> 
<tr><td class="jdhead">Role</td><td class="fullhead">Manager</td></tr> 
</tbody> 
</table> 

The two tables above come from different pages. I need to scrape the name, phone, and role.

$url = "http://name.com/listings"; 
$html = file_get_html($url); 

$posts1 = $html->find('td[class=fullhead]',1); 

foreach ($posts1 as $post1) { 
    $poster1 = $post1->outertext; 
    echo $poster1; 
}

Answers

1

I would try using preg_match to pull the required values out of the HTML, like this:

<?php 
$url = 'http://name.com/listings'; 
$html = file_get_contents($url); 

if (preg_match('~<tr><td class="jdhead">Name</td><td class="fullhead">([^<]*)</td></tr>~', $html, $matches)) { 
    echo $matches[1]; // here is your name
} 

if (preg_match('~<tr><td class="jdhead">Phone</td><td class="fullhead">([^<]*)</td></tr>~', $html, $matches)) { 
    echo $matches[1]; // here is your phone
} 

if (preg_match('~<tr><td class="jdhead">Role</td><td class="fullhead">([^<]*)</td></tr>~', $html, $matches)) { 
    echo $matches[1]; // here is your role
} 

Update (see the comments below):

<?php 
$url = 'http://jobsearch.naukri.com/job-listings-010915006292'; 
$html = file_get_contents($url); 

if (preg_match('~<TR VALIGN="top"> <TD CLASS="jdHead">Job Posted </TD> <TD VALIGN="top" CLASS="detailJob">([^<]*)</TD> </TR>~', $html, $matches)) { 
    echo 'Job Posted: ' . $matches[1] . '<br><br>'; 
} 


if (preg_match('~<TR VALIGN="top"> <TD CLASS="jdHead">Job Description</TD> <TD VALIGN="top" CLASS="detailJob">(.*?)</TD> </TR>~', $html, $matches)) { 
    echo 'Job Description: ' . $matches[1] . '<br><br>'; 
} 
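
The patterns above depend on the exact spacing, attributes, and letter case of the markup, so a small change in the page can make them stop matching. As a minimal sketch (not part of the original answer), the lookup could be wrapped in a more tolerant helper. The function name `extract_field` and the sample markup are assumptions made for illustration; the sample is copied from the second table in the question.

```php
<?php
// Hypothetical helper: returns the text of the cell that follows the cell
// containing $label, or null when the label is not present in $html.
function extract_field(string $html, string $label): ?string
{
    // \s* tolerates whitespace between tags, [^>]* tolerates any attributes,
    // the "i" flag ignores tag case and the "s" flag lets "." span newlines.
    $pattern = '~<td[^>]*>\s*' . preg_quote($label, '~')
             . '\s*</td>\s*<td[^>]*>(.*?)</td>~is';

    if (preg_match($pattern, $html, $matches)) {
        // Drop any nested tags and decode entities in the captured value.
        return trim(html_entity_decode(strip_tags($matches[1])));
    }

    return null;
}

// Sample markup (assumption: stands in for the downloaded page).
$html = <<<'HTML'
<table class="scoretable">
<tbody>
<tr><td class="jdhead">Name</td><td class="fullhead">John</td></tr>
<tr><td class="jdhead">Age</td><td class="fullhead">30</td></tr>
<tr><td class="jdhead">Phone</td><td class="fullhead">91234988788</td></tr>
<tr><td class="jdhead">Role</td><td class="fullhead">Manager</td></tr>
</tbody>
</table>
HTML;

// "Country" is missing from this table, so the helper returns null for it.
foreach (['Name', 'Phone', 'Role', 'Country'] as $label) {
    $value = extract_field($html, $label);
    echo $label . ': ' . ($value ?? '(not found)') . PHP_EOL;
}
```

Because the label is looked up by its text rather than by row position, the same call should work for both tables in the question even though one of them has extra rows. Regular expressions remain fragile for HTML in general, so a DOM parser is usually the safer choice when the markup is not under your control.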

It's not working, sir. It doesn't display any data.


Then your example data is not correct: http://sandbox.onlinephpfunctions.com/code/769680775e33474a74b10b60c74c13673c843a9a


Sir, I am scraping from a URL. The table is not in my own page. I have looked at the page above.

0

I have this solution, which works with your example:

<?php 
// load the HTML file that contains the tables
$doc = new DOMDocument(); 
$doc->loadHTMLFile("tabledata.html"); 

// labels of the rows to extract
$required_data = ['Name', 'Phone', 'Role']; 

$tbody_elements = $doc->getElementsByTagName('tbody'); 

// xpath object 
$xpath = new DOMXPath($doc); 

// array for final data 
$finaldata = []; 
// each tbody is one user 
foreach ($tbody_elements as $key => $tbody) 
{ 
    // iterate through the required data 
    foreach ($required_data as $data) 
    { 
        // select the rows that have a cell whose text equals the label
        $return = $xpath->query("tr[td[text()='$data']]", $tbody); 

        foreach ($return as $node) 
        { 
            // textContent of a row is the label and the value joined together
            $finaldata[$key][$data] = $node->textContent; 
        } 
    } 
} 

var_dump($finaldata);

Output:

array(2) {
  [0]=>
  array(3) {
    ["Name"]=>
    string(8) "NameJohn"
    ["Phone"]=>
    string(16) "Phone91234988788"
    ["Role"]=>
    string(11) "RoleManager"
  }
  [1]=>
  array(3) {
    ["Name"]=>
    string(8) "NameJohn"
    ["Phone"]=>
    string(16) "Phone91234988788"
    ["Role"]=>
    string(11) "RoleManager"
  }
}
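
Note that each value in the output above is the label and the value joined together ("NameJohn"), because the text of the whole row is stored. A possible variation, offered as a sketch and not as part of the original answer, selects only the cell that follows the label cell. The inline `$markup` string is an assumption that stands in for `tabledata.html`, and the variable names are illustrative.

```php
<?php
// Sample markup (assumption: the two tables from the question in one string).
$markup = <<<'HTML'
<table class="scoretable"><tbody>
<tr><td class="jdhead">Name</td><td class="fullhead">John</td></tr>
<tr><td class="jdhead">Age</td><td class="fullhead">30</td></tr>
<tr><td class="jdhead">Phone</td><td class="fullhead">91234988788</td></tr>
<tr><td class="jdhead">Location</td><td class="fullhead">Madrid</td></tr>
<tr><td class="jdhead">Country</td><td class="fullhead">Spain</td></tr>
<tr><td class="jdhead">Role</td><td class="fullhead">Manager</td></tr>
</tbody></table>
<table class="scoretable"><tbody>
<tr><td class="jdhead">Name</td><td class="fullhead">John</td></tr>
<tr><td class="jdhead">Age</td><td class="fullhead">30</td></tr>
<tr><td class="jdhead">Phone</td><td class="fullhead">91234988788</td></tr>
<tr><td class="jdhead">Role</td><td class="fullhead">Manager</td></tr>
</tbody></table>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($markup);
$xpath = new DOMXPath($doc);

$required = ['Name', 'Phone', 'Role'];
$result = [];

// One entry per table, keyed by the table's position in the document.
foreach ($xpath->query('//table[@class="scoretable"]') as $index => $table) {
    foreach ($required as $label) {
        // Find the label cell, then take the first cell to its right.
        $cells = $xpath->query(
            ".//td[normalize-space()='$label']/following-sibling::td[1]",
            $table
        );
        if ($cells->length > 0) {
            $result[$index][$label] = trim($cells->item(0)->textContent);
        }
    }
}

print_r($result);
```

With this query the stored values should be the bare cell contents, such as "John" and "Manager", and rows that are missing from a table are simply skipped. The labels are interpolated directly into the XPath expression, which is acceptable for a fixed list like this one but would need escaping if the labels came from user input.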

Thank you for the code, sir.