PHP timeout with file_get_html

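I have been trying to pull some data from a Wikia site using the simple_html_dom library for PHP. Basically, I use the Wikia API to get each page rendered as HTML and extract the data from there. After extracting it, I insert the data into a MySQL database. My problem is that I usually need to pull 300 records, but I get stuck at record 93: file_get_html returns null, which causes my find() call to fail. I don't know why it stops at record 93, but I have tried various fixes, such as: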

ini_set('default_socket_timeout', 120);
set_time_limit(120);

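Basically, I have to request a wiki page 300 times to get those 300 records. Most of the time, though, I only manage to get 93 records before file_get_html starts returning null. Any ideas on how to solve this?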

I also tested with cURL and ran into the same problem.

function test($url){
    $ch=curl_init();
    $timeout=5;

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);

    $result=curl_exec($ch);
    curl_close($ch);
    return $result;
}

$baseurl = 'http://xxxx.wikia.com/index.php?'; 

foreach($resultset_wiki as $name){
    // Create DOM from URL or file
    $options = array("action"=>"render","title"=>$name['name']);
    $baseurl .= http_build_query($options,'','&');
    $html = file_get_html($baseurl);
    if($html === FALSE) {
        echo "issue here";
    }
    // this code for cURL but commented for testing with file_get_html instead
    $a = test($baseurl);
    $html = new simple_html_dom();
    $html->load($a);

    // find div stuff here and mysql data pumping here.
}

$resultset_wiki is an array holding the list of titles to fetch from the wiki. That dataset is itself loaded from the DB before the search is performed.

The actual error I get is of this kind:

Call to a member function find() on a non-object in
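
For context, PHP raises this error whenever find() is called on a value that is not an object, for example a false or null result from a failed fetch. A minimal sketch of a guard follows, assuming a hypothetical helper named find_or_empty (it is not part of simple_html_dom) that skips such records instead of crashing:

```php
<?php
// Hypothetical helper, not part of simple_html_dom: returns the matches
// for $selector, or an empty array when $node is not a usable DOM object.
function find_or_empty($node, string $selector): array {
    // A failed fetch typically leaves false or null here, not an object.
    if (!is_object($node) || !method_exists($node, 'find')) {
        return array();
    }
    $found = $node->find($selector);
    // Normalise any non-array result to an empty array.
    return is_array($found) ? $found : array();
}
```

Inside the loop, something like `foreach (find_or_empty($html, 'div') as $div) { ... }` would then simply do nothing for a record whose fetch failed, which makes it easier to log the failing title and carry on.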


2014-12-02 user1897151

Have you tried using `curl` for everything? – Ghost 2014-12-02 07:16:10

Yes, I did, but I still get the same result: a null on the 93rd record, exactly as without cURL. – user1897151 2014-12-02 07:17:40

Isn't the site simply rate-limiting you because you are making a large number of calls to it in a very short time? – Erik 2014-12-02 07:25:47

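Answering my own question: the problem seems to have been the URL I was using. I switched to cURL with a POST request, sending the action and title parameters in the POST body instead.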
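
A rough sketch of what that change could look like is below. The helper names build_render_fields and fetch_rendered_page and the 30-second timeout are assumptions for illustration, not the original code. Note that in the question's loop, `$baseurl .= ...` appends to the same string on every pass, so the GET URL keeps growing; building fresh parameters for each request avoids that, which may be the URL issue referred to here.

```php
<?php
// Hypothetical helper: builds a fresh POST body for a single page title.
function build_render_fields(string $title): string {
    return http_build_query(array('action' => 'render', 'title' => $title), '', '&');
}

// Hypothetical helper: POSTs the parameters and returns the HTML,
// or null when the request fails or the body is empty.
function fetch_rendered_page(string $endpoint, string $title, int $timeout = 30) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $endpoint);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, build_render_fields($title));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout); // assumed overall limit
    $body = curl_exec($ch);
    curl_close($ch);
    return ($body === false || $body === '') ? null : $body;
}
```

With this, each iteration would call something like `fetch_rendered_page('http://xxxx.wikia.com/index.php', $name['name'])` and load the result into simple_html_dom only when it is not null.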


2014-12-03 06:37:48 user1897151
