2012-05-21 28 views

Zend_Dom_Query: getting an array

I want to scrape HTML from a website into an associative array. I tried using Zend_Dom_Query.

Example:

<div class="job"> 
    <div class="jobTitle"> 
    <a href="http://website.com/Job-Title-1">Job-Title-1</a> 
    </div> 
    <div class="company"> 
    <a href="http://website.com/Company-1">Company-1</a> 
    </div> 
    <div class="city"> 
    <a href="http://website.com/City-1">City-1</a> 
    </div> 
</div> 
<div class="job"> 
    <div class="jobTitle"> 
    <a href="http://website.com/Job-Title-2">Job-Title-2</a> 
    </div> 
    <div class="company"> 
     <a href="http://website.com/Company-2">Company-2</a> 
    </div> 
    <div class="city"> 
     <a href="http://website.com/City-2">City-2</a> 
    </div> 
</div> 

How can I get an associative array from the HTML above?

$dom = new Zend_Dom_Query($html); 
$links = $dom->query('div.jobTitle a'); 
$companies = $dom->query('div.company'); 
$cities = $dom->query('div.city'); 

// result needed (keys quoted; bare words such as link are treated as constants)
$result_array = array(
    array(
        'link'    => 'http://website.com/Job-Title-1',
        'Company' => 'Company-1',
        'City'    => 'City-1',
    ),
    array(
        'link'    => 'http://website.com/Job-Title-2',
        'Company' => 'Company-2',
        'City'    => 'City-2',
    ),
);

`Zend_Dom_Query` is just a wrapper around PHP's native DOM extension, so you have to use the DOM API to convert the DOMElements in the `Zend_Dom_Query_Result` into your array. – Gordon

Answer

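// $html holds the page markup, as in the question.
$dom = new Zend_Dom_Query($html);
$links = $dom->query('div.jobTitle a');
$companies = $dom->query('div.company');
$cities = $dom->query('div.city');

$data = [];
foreach ($links as $link) {
    // The three result sets are walked in parallel, so this assumes every
    // job has exactly one title link, one company and one city, in order.
    $data[] = [
        'link'    => $link->getAttribute('href'),
        'Company' => trim($companies->current()->textContent),
        'City'    => trim($cities->current()->textContent),
    ];
    $companies->next();
    $cities->next();
}
var_dump($data);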
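
Since Zend_Dom_Query wraps PHP's native DOM extension, the same result can also be sketched with DOMDocument and DOMXPath alone. The version below is only an illustration under some assumptions: it needs PHP 7 or later, `extractJobs` is a hypothetical helper name (not part of Zend or PHP), the sample markup is condensed from the question, and the XPath tests match the class attribute exactly, which is stricter than the CSS selector `div.job`. It reads each field relative to its own `div.job`, so a job with a missing company or city yields an empty string instead of shifting the later entries.

```php
<?php
// Sample markup condensed from the question.
$html = <<<'HTML'
<div class="job">
    <div class="jobTitle"><a href="http://website.com/Job-Title-1">Job-Title-1</a></div>
    <div class="company"><a href="http://website.com/Company-1">Company-1</a></div>
    <div class="city"><a href="http://website.com/City-1">City-1</a></div>
</div>
<div class="job">
    <div class="jobTitle"><a href="http://website.com/Job-Title-2">Job-Title-2</a></div>
    <div class="company"><a href="http://website.com/Company-2">Company-2</a></div>
    <div class="city"><a href="http://website.com/City-2">City-2</a></div>
</div>
HTML;

/**
 * Hypothetical helper: build one associative array per <div class="job">.
 */
function extractJobs(string $html): array
{
    $doc = new DOMDocument();

    // Scraped HTML is rarely valid; buffer parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $jobs = [];

    foreach ($xpath->query('//div[@class="job"]') as $job) {
        // Each expression is evaluated relative to the current job node;
        // string() returns '' when the node is missing.
        $jobs[] = [
            'link'    => $xpath->evaluate('string(.//div[@class="jobTitle"]/a/@href)', $job),
            'Company' => trim($xpath->evaluate('string(.//div[@class="company"])', $job)),
            'City'    => trim($xpath->evaluate('string(.//div[@class="city"])', $job)),
        ];
    }

    return $jobs;
}

$result = extractJobs($html);
print_r($result);
```

For the sample markup this should produce the two entries asked for in the question, with the keys link, Company and City. If the real page uses several classes on one element (for example `class="job featured"`), the exact-match predicates would need to be loosened.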