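This site appears to expect a cookie named AspxAutoDetectCookieSupport. If the cookie is missing, the site redirects you to a cookie-detection URL, and the request gets stuck in a redirect loop: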
> curl -I -L http://maxhire.net/cp/?EA5E6F361D4364703D044F72
HTTP/1.1 302 Found
Date: Fri, 23 Aug 2013 23:10:55 GMT
Server: Microsoft-IIS/6.0
P3P: CP="CAO PSA OUR"
X-Powered-By: ASP.NET
X-AspNet-Version: 4.0.30319
Location: /cp/?EA5E6F361D4364703D044F72&AspxAutoDetectCookieSupport=1
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 180
Connection: Keep-Alive
Set-Cookie: AspxAutoDetectCookieSupport=1; path=/
HTTP/1.1 302 Found
Date: Fri, 23 Aug 2013 23:10:56 GMT
Server: Microsoft-IIS/6.0
P3P: CP="CAO PSA OUR"
X-Powered-By: ASP.NET
X-AspNet-Version: 4.0.30319
Location: /cp/?EA5E6F361D4364703D044F72&AspxAutoDetectCookieSupport=1&AspxAutoDetectCookieSupport=1
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 214
Connection: Keep-Alive
Set-Cookie: AspxAutoDetectCookieSupport=1; path=/
HTTP/1.1 302 Found
Date: Fri, 23 Aug 2013 23:10:57 GMT
Server: Microsoft-IIS/6.0
P3P: CP="CAO PSA OUR"
X-Powered-By: ASP.NET
X-AspNet-Version: 4.0.30319
Location: /cp/?EA5E6F361D4364703D044F72&AspxAutoDetectCookieSupport=1&AspxAutoDetectCookieSupport=1&AspxAutoDetectCookieSupport=1
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 248
Connection: Keep-Alive
Set-Cookie: AspxAutoDetectCookieSupport=1; path=/
^C
So you need to set this cookie, AspxAutoDetectCookieSupport=1:
curl_setopt($ch, CURLOPT_COOKIE, 'AspxAutoDetectCookieSupport=1');
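As an alternative to hard-coding the cookie, curl's cookie engine can be switched on so that the Set-Cookie header from the first 302 is sent back on the redirect. This is only a sketch and has not been tested against this site. The helper name cookie_engine_options is made up for illustration. Passing an empty string to CURLOPT_COOKIEFILE enables the engine without reading any file, and CURLOPT_MAXREDIRS makes a loop like the one above stop after a few hops.

```php
<?php
// Hypothetical helper (not part of the original answer): curl options that
// keep cookies in memory between redirects instead of hard-coding one cookie.
function cookie_engine_options(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => 5,   // give up instead of looping forever
        CURLOPT_COOKIEFILE     => '',  // empty string: enable the in-memory cookie engine
    ];
}

// Usage (this part makes a network request):
// $ch = curl_init();
// curl_setopt_array($ch, cookie_engine_options('http://maxhire.net/cp/?EA5E6F361D4364703D044F72'));
// $body = curl_exec($ch);
```

Whether the server accepts the replayed cookie is an assumption. If it does not, setting CURLOPT_COOKIE explicitly remains the known-working route.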
That solves the first problem, but a second one appears: if you don't set a user agent, the server returns this page:
<html xmlns:atom="http://www.w3.org/2005/Atom">
<head><meta http-equiv="Content-Type" content="text/xml; charset=iso-8859-1" /><title>
Untitled Page
</title><link href="App_Themes/Default/Common.css" type="text/css" rel="stylesheet" />
<link href="App_Themes/Default/Container.css" type="text/css" rel="stylesheet" />
<link href="App_Themes/Default/Content.css" type="text/css" rel="stylesheet" />
<link href="App_Themes/Default/Login.css" type="text/css" rel="stylesheet" /></head>
<body>
<form name="form1" method="post" action="rssCurrentJobs.aspx?site=5E6F361D4364703D044F72" id="form1">
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKMTc2MTg4NDc4NmRk" />
<div>
</div>
</form>
</body>
</html>
So add a user agent value:
curl_setopt($ch, CURLOPT_USERAGENT, "SomeUserAgent");
Full code:
function url_get_contents($Url) {
    // Bail out early if the curl extension is missing.
    if (!function_exists('curl_init')) {
        die('CURL is not installed!');
    }
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $Url);
    curl_setopt($ch, CURLOPT_POST, 0);            // plain GET request
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);  // follow the 302 redirects
    curl_setopt($ch, CURLOPT_USERAGENT, "SomeUserAgent");
    curl_setopt($ch, CURLOPT_COOKIE, 'AspxAutoDetectCookieSupport=1');  // avoids the cookie-detection loop
    $output = curl_exec($ch);
    curl_close($ch);
    return $output;
}
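If you want failures to be visible rather than silently getting false back, a stricter variant might look like the sketch below. The name fetch_feed, the redirect cap, and the timeout values are illustrative choices of mine, not something the site requires.

```php
<?php
// Sketch of a stricter variant; fetch_feed is an illustrative name only.
function fetch_feed(string $url, string $userAgent = 'SomeUserAgent'): string
{
    if (!function_exists('curl_init')) {
        throw new RuntimeException('The PHP curl extension is not installed.');
    }

    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => 5,    // assumed cap: stop a redirect loop early
        CURLOPT_CONNECTTIMEOUT => 5,    // assumed values, in seconds
        CURLOPT_TIMEOUT        => 15,
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_COOKIE         => 'AspxAutoDetectCookieSupport=1',
    ]);

    $output = curl_exec($ch);
    if ($output === false) {
        // curl_error() describes the failure, e.g. too many redirects.
        throw new RuntimeException('curl request failed: ' . curl_error($ch));
    }

    // The handle is released when $ch goes out of scope.
    return $output;
}
```

Called as fetch_feed($Url), this returns the response body as a string. If the transfer fails, for example on a timeout or when the redirect cap is hit, it throws a RuntimeException instead of returning false.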
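Technically all you need is `file_get_contents('http:// ....')`, assuming PHP's `allow_url_fopen` is enabled. Most likely the feed is doing UA and/or referer filtering, so you will have to do a better job of pretending to be a regular browser. –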
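Hi, thanks for your comment. file_get_contents was my first option, but even with allow_url_fopen set to true it doesn't work for this URL :-( Your hint about "a regular browser" gave me a few ideas, like the user agent... but of course, if someone knows the solution, that would be even more welcome!! I'm a client-side guy ;-) – user2712273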
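On the file_get_contents route discussed in the comments: the same cookie and user agent can be passed through an HTTP stream context. This is a sketch under the assumption that those two values are all the server checks. It has not been tested against this site, and feed_stream_context is a hypothetical helper name.

```php
<?php
// Hypothetical helper: an HTTP stream context that carries the same cookie
// and user agent as the curl version.
function feed_stream_context(string $userAgent = 'SomeUserAgent')
{
    return stream_context_create([
        'http' => [
            'method'          => 'GET',
            'user_agent'      => $userAgent,
            'header'          => "Cookie: AspxAutoDetectCookieSupport=1\r\n",
            'follow_location' => 1,
            'max_redirects'   => 5,   // assumed cap
            'timeout'         => 15,  // assumed value, in seconds
        ],
    ]);
}

// Usage (requires allow_url_fopen and makes a network request):
// $xml = file_get_contents($url, false, feed_stream_context());
```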