Using PHP to detect all broken links (including those that do not point to a file)

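I am trying to detect broken links. The following PHP, which reads the links from a MySQL table, seems to work well for almost all of them (although it is slow because of fopen):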

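function fileExists($path) {
    // fopen() returns a stream resource on success and false on failure.
    $handle = @fopen($path, "r");
    if ($handle === false) {
        return false;
    }
    fclose($handle); // Release the handle so it does not leak inside the loop.
    return true;
}

// Note: the mysql_* functions were removed in PHP 7; mysqli or PDO replace them.
$result = mysql_query("SELECT id, title, link FROM table");
while ($row = mysql_fetch_array($result)) {
    $status = ""; // Reset for every row so an earlier broken link does not carry over.
    $id = $row['id'];
    $title = $row['title'];
    $link = $row['link'];
    // etc.
    if ($link) {
        if (!fileExists($link)) {
            $status = 'BROKEN_LINK';
        }
    }
    // Here do something if the status gets set to broken
}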

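But the problem is with links like this one:

torrentfreak.com/unblocking-the-pirate-bay-the-hard-way-is-fun-for-geeks-120506

Here the link does not point to a file; it goes somewhere and gets content back. So what is the best way to correctly detect broken links in cases like this, when they are not on my own domain?

Thanks!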

Mordak


2012-07-08 Mordak

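Download it with cURL and check the HTTP response headers (for a 404). – 2012-07-08 20:22:20

Why does he need to download the whole page? Just fetch the headers for each link with cURL and check them for a 404. – tftd 2012-07-08 20:24:34

Take a look at the PHP cURL library: [link to the PHP cURL library manual page](http://php.net/manual/en/book.curl.php) – Stegrex 2012-07-08 20:32:22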

You can try the cURL approach:

function fileExists($pageScrape, $path) { // Takes a shared cURL handle so it can be reused for every link.
    curl_setopt($pageScrape, CURLOPT_URL, $path); // Set URL path.
    curl_setopt($pageScrape, CURLOPT_RETURNTRANSFER, true); // Don't output the scraped page directly.
    curl_exec($pageScrape); // Execute cURL call.
    $status = curl_getinfo($pageScrape, CURLINFO_HTTP_CODE); // HTTP status code of the page; 0 if no response was received.
    return $status >= 200 && $status <= 299; // True only for 2xx success codes.
}

$pageScrape = curl_init();

$result = mysql_query("SELECT id, title, link FROM table");
while ($row = mysql_fetch_array($result)) {
    $status = ""; // Reset for every row so an earlier broken link does not carry over.
    $id = $row['id'];
    $title = $row['title'];
    $link = $row['link'];
    // etc.
    if ($link) {
        if (!fileExists($pageScrape, $link)) {
            $status = 'BROKEN_LINK';
        }
    }
    // Here do something if the status gets set to broken
}
curl_close($pageScrape);
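
As suggested in the comments, the page body does not have to be downloaded to learn the status code. The following is a minimal sketch of that variation, not part of the original answer: the function name `headStatus` and the timeout and redirect limits are assumptions chosen for illustration. It asks cURL for a HEAD request via `CURLOPT_NOBODY` and follows redirects so that a moved page is not reported as broken.

```php
<?php
// Hypothetical helper: returns the final HTTP status code for $url,
// or 0 if no HTTP response was received (DNS failure, timeout, refused connection).
function headStatus(string $url, int $timeoutSeconds = 10): int
{
    $ch = curl_init();
    if ($ch === false) {
        return 0; // cURL handle could not be created
    }
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_NOBODY         => true,            // HEAD request: headers only, no body
        CURLOPT_RETURNTRANSFER => true,            // never echo anything to the output
        CURLOPT_FOLLOWLOCATION => true,            // follow 3xx redirects to the final page
        CURLOPT_MAXREDIRS      => 5,               // assumed limit, avoids redirect loops
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds, // assumed limit for connecting
        CURLOPT_TIMEOUT        => $timeoutSeconds, // assumed limit for the whole request
    ]);
    curl_exec($ch);
    // The handle is released automatically when $ch goes out of scope.
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}
```

Inside the loop it could be called as `$code = headStatus($link);`. Some servers answer HEAD requests differently from GET requests (for example with 405), so a link that fails this check may deserve a second attempt with a normal GET before it is marked as broken.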

You can fine-tune the status check by consulting the list of HTTP status codes: Wikipedia link
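
One way such fine-tuning might look is sketched below. The function name `classifyLinkStatus` and the category labels other than `BROKEN_LINK` are hypothetical, and which codes count as broken is a judgment call that depends on the links being checked. The input is the integer returned by `curl_getinfo($handle, CURLINFO_HTTP_CODE)`.

```php
<?php
// Hypothetical helper: maps an HTTP status code to a coarse link state.
// The category names are illustrative assumptions, not part of the original answer.
function classifyLinkStatus(int $status): string
{
    if ($status === 0) {
        return 'UNREACHABLE';   // cURL reports 0 when no HTTP response arrived
    }
    if ($status >= 200 && $status <= 299) {
        return 'OK';            // success
    }
    if ($status >= 300 && $status <= 399) {
        return 'REDIRECT';      // only seen when redirects are not followed
    }
    if ($status === 404 || $status === 410) {
        return 'BROKEN_LINK';   // Not Found / Gone
    }
    if (in_array($status, [401, 403, 405, 429], true)) {
        return 'NEEDS_REVIEW';  // the page may exist but refused this request
    }
    return 'ERROR';             // anything else, including all 5xx codes
}
```

For example, `classifyLinkStatus(404)` yields `'BROKEN_LINK'`, while `classifyLinkStatus(403)` yields `'NEEDS_REVIEW'`, since a forbidden page is not necessarily a dead link.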


2012-07-08 20:42:09 Stegrex
