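I want to write a script that parses my university timetable and saves it to disk. I am using cURL for this. The main link to the timetable is here. If I open it in a browser I can see the content, but when I try to open it with cURL it fails =( The cURL bot does not work.
Here is the source code of the PHP script: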
<?php
$url = "http://cist.kture.kharkov.ua/ias/app/tt/f?p=778:201:128623920522090:::201:P201_FIRST_DATE,P201_LAST_DATE,P201_GROUP,P201_POTOK:01.02.2012,30.07.2012,2423461,0:";
$ch = curl_init();
$cookieFile = tempnam (dirname(__FILE__) . "/cookies/", 'cookie-');
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$ua = "Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.26 Safari/535.11";
//$headers = array('HOST: cist.kture.kharkov.ua','CONNECTION: keep-alive','CACHE_CONTROL: max-age=0','USER_AGENT: Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.26 Safari/535.11','ACCEPT: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8','REFERER: http://google.com.ua', 'ACCEPT_ENCODING: gzip,deflate,sdch','ACCEPT_LANGUAGE: ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4','ACCEPT_CHARSET: windows-1251,utf-8;q=0.7,*;q=0.3');
//curl_setopt($ch, CURLOPT_AUTOREFERER, 1);
//curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
//curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_USERAGENT, $ua);
curl_setopt($ch, CURLOPT_URL, $url);
$data = curl_exec($ch);
$info = curl_getinfo($ch);
$counter = 0;
while($info['redirect_url']!= "")
{
echo "url => ". $url."<br />\n";
echo "redirect => ". $info['redirect_url']."<br /><br />\n";
curl_setopt($ch, CURLOPT_REFERER, $url);
curl_setopt($ch, CURLOPT_URL, $info['redirect_url']);
$url = $info['redirect_url'];
$data = curl_exec($ch);
$info = curl_getinfo($ch);
$counter++;
if($counter>100)
break;
}
foreach ($info as $key => $value) {
echo $key . " -> ".$value."<br />\n";
}
$html = htmlspecialchars($data);
echo "<pre>$html</pre>";
?>
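As a result I get a blank page :( Please help me.
My script has a loop that follows the redirect URLs. It follows them twice and then stops, but $data is empty =( – user1437607
You are right, it keeps redirecting. Maybe it has something to do with the cookie the server sends. Try sending that cookie along with the subsequent requests and see if it makes any difference. – Petter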
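A minimal sketch of what the comment above suggests: read the Set-Cookie lines from each response and send them back explicitly on the next request of the redirect loop. Whether missing cookies are really the cause is not confirmed in this thread, so treat it as something to try. The function names extractCookies, buildCookieHeader and fetchWithCookies are hypothetical helpers, not part of the original script, and the default limit of 10 redirects is an arbitrary choice.

```php
<?php
// Hypothetical helper: collect name/value pairs from the Set-Cookie
// lines of a raw HTTP response header block.
function extractCookies(string $rawHeaders): array
{
    $cookies = [];
    foreach (preg_split('/\r\n|\n/', $rawHeaders) as $line) {
        // Header names are case-insensitive.
        if (stripos($line, 'Set-Cookie:') !== 0) {
            continue;
        }
        // Keep only the leading "name=value" pair; drop attributes such as path.
        $pair = trim(explode(';', substr($line, strlen('Set-Cookie:')), 2)[0]);
        if (strpos($pair, '=') === false) {
            continue;
        }
        [$name, $value] = explode('=', $pair, 2);
        $cookies[trim($name)] = trim($value);
    }
    return $cookies;
}

// Hypothetical helper: format the collected cookies for CURLOPT_COOKIE.
function buildCookieHeader(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return implode('; ', $pairs);
}

// Sketch: follow redirects by hand, resending every cookie seen so far.
// Returns the body of the last response received.
function fetchWithCookies(string $url, int $maxRedirects = 10): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, true); // include response headers in the result
    $cookies = [];
    $body = '';
    for ($i = 0; $i <= $maxRedirects && $url !== ''; $i++) {
        curl_setopt($ch, CURLOPT_URL, $url);
        if ($cookies) {
            curl_setopt($ch, CURLOPT_COOKIE, buildCookieHeader($cookies));
        }
        $response = curl_exec($ch);
        if ($response === false) {
            break; // transport error; curl_error($ch) has the details
        }
        // Split the response into its header block and body.
        $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
        $cookies = array_replace($cookies, extractCookies(substr($response, 0, $headerSize)));
        $body = substr($response, $headerSize);
        // Empty when the response is not a redirect, which ends the loop.
        $url = (string) curl_getinfo($ch, CURLINFO_REDIRECT_URL);
    }
    return $body;
}
```

Calling fetchWithCookies($url) with the timetable URL from the question would replace the manual while loop. Printing $cookies on each pass shows whether the server keeps issuing a new session cookie with every redirect.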