2010-04-22 101 views
10

How can I open multiple URLs at the same time using cURL with PHP?

Here is my current code:

$SQL = mysql_query("SELECT url FROM urls") or die(mysql_error()); //Query the urls table 
while ($resultSet = mysql_fetch_array($SQL)) { //Put all the urls into one variable 

    // Now for some cURL to run it. 
    $ch = curl_init($resultSet['url']); //load the urls 
    curl_setopt($ch, CURLOPT_TIMEOUT, 2); //No need to wait for it to load. Execute it and go. 
    curl_exec($ch); //Execute 
    curl_close($ch); //Close it off 
} //While loop 

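I'm relatively new to cURL. By relatively new, I mean this is the first time I've ever used cURL. Currently it loads one URL for two seconds, then loads the next one for two seconds, then the next. However, I want it to load all of them at the same time. I'm sure it's possible, I'm just not sure how. If someone could point me in the right direction, I'd appreciate it.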


Do you need to do anything with the results of what cURL loads? – 2010-04-22 16:37:59


No. – Rob 2010-04-22 16:39:06

Answers

8

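You set up each cURL handle the same way as before, then add them all to a curl_multi_ handle. The functions to look at are the curl_multi_* functions, which are documented in the PHP manual. In my experience, though, there were problems with trying to load too many URLs at once (although I can't find my notes on it at the moment), so the last time I used curl_multi_, I set it up to run in batches of 5 URLs at a time.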

Edit: Here is a reduced version of the code I've been using with curl_multi_:

Edit: Slightly rewritten, with lots of added comments, which will hopefully help.

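// -- assumed setup, not shown in the original snippet: $urls is an array of URL strings 
// and BLOCK_SIZE is how many handles to run at once (5, matching the batch size mentioned above) 
define('BLOCK_SIZE', 5); 

// -- create all the individual cURL handles and set their options 
$curl_handles = array(); 
foreach ($urls as $url) { 
    $curl_handles[$url] = curl_init(); 
    curl_setopt($curl_handles[$url], CURLOPT_URL, $url); 
    // set other curl options here 
} 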

// -- start going through the cURL handles and running them 
$curl_multi_handle = curl_multi_init(); 

$i = 0; // count where we are in the list so we can break up the runs into smaller blocks 
$block = array(); // to accumulate the curl_handles for each group we'll run simultaneously 

foreach ($curl_handles as $a_curl_handle) { 
    $i++; // increment the position-counter 

    // add the handle to the curl_multi_handle and to our tracking "block" 
    curl_multi_add_handle($curl_multi_handle, $a_curl_handle); 
    $block[] = $a_curl_handle; 

    // -- check to see if we've got a "full block" to run or if we're at the end of our list of handles 
    if (($i % BLOCK_SIZE == 0) or ($i == count($curl_handles))) { 
     // -- run the block 

     $running = NULL; 
     do { 
      // track the previous loop's number of handles still running so we can tell if it changes 
      $running_before = $running; 

      // run the block or check on the running block and get the number of sites still running in $running 
      curl_multi_exec($curl_multi_handle, $running); 

      // if the number of sites still running changed, print out a message with the number of sites that are still running. 
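      if ($running != $running_before) { 
       echo("Waiting for $running sites to finish...\n"); 
      } 

      // wait for activity on any handle (up to 1 second) instead of spinning the CPU in a tight loop 
      if ($running > 0) { 
       curl_multi_select($curl_multi_handle, 1.0); 
      } 
     } while ($running > 0); 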

     // -- once the number still running is 0, curl_multi_ is done, so check the results 
     foreach ($block as $handle) { 
      // HTTP response code 
      $code = curl_getinfo($handle, CURLINFO_HTTP_CODE); 

      // cURL error number 
      $curl_errno = curl_errno($handle); 

      // cURL error message 
      $curl_error = curl_error($handle); 

      // output if there was an error 
      if ($curl_error) { 
       echo(" *** cURL error: ($curl_errno) $curl_error\n"); 
      } 

      // remove the (used) handle from the curl_multi_handle 
      curl_multi_remove_handle($curl_multi_handle, $handle); 
     } 

     // reset the block to empty, since we've run its curl_handles 
     $block = array(); 
    } 
} 

// close the curl_multi_handle once we're done 
curl_multi_close($curl_multi_handle); 

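Given that you don't need anything back from the URLs, you probably don't need a lot of what's there, but this is how I split the requests into blocks of BLOCK_SIZE, waited for each block to finish before moving on, and caught errors from cURL.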
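For the fire-and-forget case in the question, the same idea can probably be written more compactly. The following is only a sketch under some assumptions: the function name run_urls_in_blocks and its parameters are made up for illustration, the URLs are assumed to be unique (they are used as array keys), PHP 7 or later is assumed for the type declarations, and errors are collected through curl_multi_info_read rather than curl_errno.

```php
<?php
// Hypothetical helper (not part of the original answer): requests every URL,
// running at most $blockSize transfers at once. Returns cURL error messages
// keyed by URL; an empty array means no cURL errors were reported.
function run_urls_in_blocks(array $urls, int $blockSize = 5, int $timeout = 2): array
{
    $errors = [];

    // array_chunk splits the list into blocks of at most $blockSize URLs.
    foreach (array_chunk($urls, $blockSize) as $block) {
        $multi = curl_multi_init();
        $handles = [];

        foreach ($block as $url) {
            $ch = curl_init($url);
            curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not echo the response body
            curl_multi_add_handle($multi, $ch);
            $handles[$url] = $ch;
        }

        // Drive all transfers in this block until none are still running.
        do {
            $status = curl_multi_exec($multi, $running);
            if ($running > 0 && curl_multi_select($multi, 1.0) === -1) {
                usleep(1000); // select failed; pause briefly rather than spin
            }
        } while ($running > 0 && $status === CURLM_OK);

        // Read the per-transfer results before the handles are removed.
        while (($info = curl_multi_info_read($multi)) !== false) {
            if ($info['result'] !== CURLE_OK) {
                $url = array_search($info['handle'], $handles, true);
                $errors[$url] = curl_strerror($info['result']);
            }
        }

        foreach ($handles as $ch) {
            curl_multi_remove_handle($multi, $ch);
        }
        curl_multi_close($multi);
    }

    return $errors;
}
```

With the URLs from the urls table collected into an array first, a call such as run_urls_in_blocks($urls, 5, 2) would run them five at a time with the two-second timeout from the question.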


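Alright. All I need to do is load each URL, not save or output any data. The URLs being loaded are blank pages; visiting one just starts a script and lets it run for a preset amount of time. Do you think that would cause any problems in this case? – Rob 2010-04-22 16:45:17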


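My guess is that it won't be a problem in that case, but I'm not sure. If it fails to run or throws errors when you try to load them all at once, you could put a counter in your 'while' loop, and whenever 'counter % batch_size == 0' inside the loop, run the batch and clear it. – Isaac 2010-04-22 18:11:12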
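The counter approach described in that comment could be sketched roughly as follows. The names process_in_batches and $runBatch are hypothetical, and the callback stands in for whatever actually runs a batch (for example, a curl_multi_ block); the final partial batch is flushed after the loop, which the comment does not spell out but which is needed so that leftover items are not dropped.

```php
<?php
// Hypothetical helper: collects items from any iterable and hands them to
// $runBatch every $batchSize items, plus once more for a final partial batch.
// Returns the number of batches that were run.
function process_in_batches(iterable $items, int $batchSize, callable $runBatch): int
{
    $batch = [];
    $counter = 0;
    $batchesRun = 0;

    foreach ($items as $item) {
        $batch[] = $item;
        $counter++;

        // A full batch has accumulated: run it and clear it.
        if ($counter % $batchSize === 0) {
            $runBatch($batch);
            $batch = [];
            $batchesRun++;
        }
    }

    // Run whatever is left over when the item count is not a multiple of $batchSize.
    if ($batch !== []) {
        $runBatch($batch);
        $batchesRun++;
    }

    return $batchesRun;
}

// Usage: record each batch instead of making real requests.
$seen = [];
$count = process_in_batches(['a', 'b', 'c', 'd', 'e'], 2, function (array $batch) use (&$seen) {
    $seen[] = $batch;
});
// $seen is now [['a', 'b'], ['c', 'd'], ['e']] and $count is 3
```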


Wow. I hate to bother you with this, but could you comment some things in that code so I can see what everything does? – Rob 2010-04-22 18:39:56