2012-09-12 14 views

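Using QueryPath to collect links from multiple pages

I have a function that is passed an array of URLs. Each web page contains a series of links to other pages. I want to return the complete list of those links from every web page passed to this function. I am stuck on how to combine the arrays on each pass through the loop.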

function getitemurls ($pagelinks) {
    global $host;
    foreach ($pagelinks as $link) {
        $circdl = my_curl($link);
        $circqp = htmlqp($circdl, 'body');
        $circlinks = array();
        foreach ($circqp->branch()->top('area[href]') as $item) {
            $circlinks[] = $item->attr('href');
        }
        for ($i = 0; $i < count($circlinks); ++$i) {
            $fullitemurl = join(array($host, $circlinks[$i]));
        }
    }
    return $fullitemurl;
}

For example:

Webpage 1: page1.html 
<html><body><area shape="rect" href="http://www.google.com" coords="110,151,173,225" alt=""/></body></html> 

Webpage 2: page2.html 
<html><body><area shape="rect" href="http://www.yahoo.com" coords="110,151,173,225" alt=""/></body></html>

Here is the array of the two pages:

$array = array (
"0" => "page1.html", 
"1" => "page2.html",); 

From this array I would like to get back:

getitemurls($array) 
Array ([0] => http://www.google.com [1] => http://www.yahoo.com) 

Figured it out: just declare $fullitemurl as an array before the loop. Works great now!

Answer


I ended up just declaring my array before the loop and then appending to it inside the loop:

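function getitemurls ($pagelinks) {
    global $host;
    // Declare the result array once, before the loop, so links accumulate across pages.
    $fullitemurls = array();
    foreach ($pagelinks as $link) {
        $circdl = my_curl($link);
        $circqp = htmlqp($circdl, 'body');
        $circlinks = array();
        foreach ($circqp->branch()->top('area[href]') as $item) {
            $circlinks[] = $item->attr('href');
        }
        for ($i = 0; $i < count($circlinks); ++$i) {
            // Append with [] rather than overwriting with a plain assignment.
            $fullitemurls[] = join(array($host, $circlinks[$i]));
        }
    }
    // Return the same array that was declared above (empty if no links were found).
    return $fullitemurls;
}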
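
The same accumulate-before-the-loop pattern can be tried without QueryPath or a network connection. The following is only a minimal sketch under stated assumptions: `extract_area_hrefs()` and `collect_item_urls()` are hypothetical names, PHP's built-in DOM extension stands in for `htmlqp()`, and the `$fetch` callback stands in for `my_curl()`. It is not the original poster's implementation.

```php
<?php
// Hypothetical helper: return the href of every <area href="..."> in one HTML string.
// Assumption: PHP's built-in DOM extension is used here in place of QueryPath.
function extract_area_hrefs(string $html): array
{
    if ($html === '') {
        return []; // nothing to parse
    }
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($doc->getElementsByTagName('area') as $area) {
        if ($area->hasAttribute('href')) {
            $hrefs[] = $area->getAttribute('href');
        }
    }
    return $hrefs;
}

// Hypothetical variant of getitemurls(): $fetch stands in for my_curl().
function collect_item_urls(array $pages, callable $fetch, string $host = ''): array
{
    $all = []; // declared once, before the loop, so an empty input still returns an array
    foreach ($pages as $page) {
        foreach (extract_area_hrefs($fetch($page)) as $href) {
            $all[] = $host . $href; // append; a plain "=" would keep only the last link
        }
    }
    return $all;
}

// Usage: the two example pages are held in memory instead of being downloaded.
$pages = [
    'page1.html' => '<html><body><area shape="rect" href="http://www.google.com" coords="110,151,173,225" alt=""/></body></html>',
    'page2.html' => '<html><body><area shape="rect" href="http://www.yahoo.com" coords="110,151,173,225" alt=""/></body></html>',
];
$fetch = function (string $name) use ($pages): string {
    return $pages[$name];
};
$result = collect_item_urls(array_keys($pages), $fetch);
print_r($result);
```

Passing the fetcher in as a callback keeps the loop logic testable on its own, and leaving `$host` empty avoids prefixing links that are already absolute, as the ones in the example pages are.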