2013-07-02 237 views
2

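PHP cURL login over HTTPS

I am trying to log in to an ets.org/toefl account using PHP cURL, but I cannot log in to the site. I usually get an error saying the server is busy, yet logging in with a browser works. My code is attached. Can anyone see what is wrong?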

<?php 
include('simple_html_dom.php'); 

$login_url = 'https://toefl-registration.ets.org/TOEFLWeb/logon.do'; 

$username='****'; 
$password='***'; 
$ck = 'cookie.txt'; 

$agent = 'Mozilla/5.0 (Windows NT 6.1; rv:22.0) Gecko/20100101 Firefox/22.0'; 
// extra headers 
$headers[] = "Connection: keep-alive"; 
//$headers[]= "Accept-Encoding: gzip, deflate"; 


$ch = curl_init(); 

curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);   
curl_setopt($ch, CURLOPT_USERAGENT, $agent); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); 

curl_setopt($ch, CURLOPT_COOKIEJAR, $ck); 
curl_setopt ($ch, CURLOPT_COOKIEFILE, $ck); 

curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
//curl_setopt($ch, CURLOPT_URL, 'https://toefl-registration.ets.org/TOEFLWebextISERLogonPrompt.do'); 

$output = curl_exec($ch); 
//echo $output; 

$html = new simple_html_dom(); 
$html = str_get_html($output); 
$e = $html->find(".loginform"); 
$a = $e[0]->find('input'); 
$str = $a[0]->outertext; 
preg_match("/value=\"(.*)\"/",$str,$match); 
$h_attr = $match[1]; 

$fields['org.apache.struts.taglib.html.TOKEN'] = $h_attr; 
$fields['currentLocale']= 'en_US'; 
$fields['username'] = $username; 
$fields['password'] = $password; 
$fields['x'] = 11; 
$fields['y'] = 4; 
//print_r($fields); 
//echo "\r\n"; 
$POSTFIELDS = http_build_query($fields); 
//echo $POSTFIELDS; 

$headers[] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"; 
$headers[] = "Accept-Language: en-US,en;q=0.5"; 
$headers[]="Referer: https://toefl-registration.ets.org/TOEFLWeb/extISERLogonPrompt.do"; 

curl_setopt($ch, CURLOPT_URL, $login_url); 
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); 
curl_setopt($ch, CURLOPT_VERBOSE, true); 
curl_setopt($ch, CURLOPT_POST, 1); 
curl_setopt($ch, CURLOPT_POSTFIELDS, $POSTFIELDS); 
$result = curl_exec($ch); 
print $result; 

(Update from the comments)

POST data sent by the browser:

org.apache.struts.taglib.html.TOKEN=c1b88957e9914492fe8cc20b33ef1cdd&currentLocale=en_US&username=name&password=pass&x=23&y=3

POST data sent by my script:

org.apache.struts.taglib.html.TOKEN=345a9f935b2db8a69f55c5b4d3372190&currentLocale=en_US&username=name&password=pass&x=11&y=4

POST request as shown in the PHP cURL verbose output:

POST /TOEFLWeb/logon.do HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:22.0) Gecko/20100101 Firefox/22.0
Host: toefl-registration.ets.org
Cookie: au=MTM3Mjc4ODQwMg%3d%3d; server=3; JSESSIONID=23C39022E2641B8F5AC944295837315E
Connection: keep-alive
Accept: */*
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Referer: toefl-registration.ets.org/TOEFLWeb/extISERLogonPrompt.do
Content-Length: 134
Content-Type: application/x-www-form-urlencoded

+0

Do you need to set the cookie jar on the second curl request? – chrislondon

+0

Also, is your cookie file writable? I could see that causing problems if it isn't. – chrislondon

+0

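I think they are trying to protect their form with a token for a reason... If I were you, I would make sure that what you are doing here is compatible with the site's TOS. – CBroe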

Answers

0

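I got it working... I added certificate verification to the code. I also found that there needs to be some delay between the two functions that get the cookie and log in. The working code is below.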

<?php 
include('simple_html_dom.php'); 

$login_url = 'https://toefl-registration.ets.org/TOEFLWeb/logon.do'; 
$cookie_page = 'https://toefl-registration.ets.org/TOEFLWeb/extISERLogonPrompt.do'; 

$username='******'; 
$password='******'; 

//$ck = 'E:\Projects\Web Development\toefl_script\cookie.txt'; 
$ck = 'D:\Nikhil\Projects\Wamp\toeflscript\cookie.txt'; 

//$agent = 'Mozilla/5.0 (Windows NT 6.1; rv:22.0) Gecko/20100101 Firefox/22.0'; 
$agent = 'Mozilla/5.0 (Windows NT 6.1; rv:21.0) Gecko/20100101 Firefox/21.0'; 

$headers[] = "Connection: keep-alive"; 
$headers[] = "Accept: */*"; 


/* Begin Program Execution */ 

init_curl(); 
get_cookie(); 
sleep(30); 
login(); 

function get_cookie() 
{ 
    global $ch, $ck, $h_attr, $headers, $cookie_page;

    curl_setopt($ch, CURLOPT_URL, $cookie_page); 

    //curl_setopt($ch, CURLOPT_VERBOSE, true); 
    $output = curl_exec($ch); 
    //echo $output; 

    /* 
    $html = new simple_html_dom(); 
    $html = str_get_html($output); 
    $e = $html->find(".loginform"); 
    $a = $e[0]->find('input'); 
    $str = $a[0]->outertext; 
    preg_match("/value=\"(.*)\"/",$str,$match); 
    $h_attr = $match[1]; 
    */ 
} 

function init_curl() 
{ 
    global $ch, $ck, $h_attr, $headers, $agent;

    ini_set('max_execution_time', 300); 

    $ch = curl_init(); 

    curl_setopt($ch, CURLOPT_HEADER, 0); 
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); 

    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true); 
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2); 

    curl_setopt($ch, CURLOPT_CAINFO, getcwd() . '/cacert.pem'); 

    curl_setopt($ch, CURLOPT_USERAGENT, $agent); 
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); 

    curl_setopt($ch, CURLOPT_COOKIEJAR, $ck); 
    curl_setopt ($ch, CURLOPT_COOKIEFILE, $ck); 
} 

function login() 
{ 
    global $ch, $login_url, $password, $username, $ck, $h_attr, $headers; 

    //$fields['org.apache.struts.taglib.html.TOKEN'] = 'abc';//$h_attr; 
    $fields['currentLocale']= 'en_US'; 
    $fields['username'] = $username; 
    $fields['password'] = $password; 
    $fields['x'] = 11; 
    $fields['y'] = 4; 

    $POSTFIELDS = http_build_query($fields); 
    //print_r($fields); 
    //echo $POSTFIELDS; 

    $headers[] = "Accept-Language: en-US,en;q=0.5"; 
    $headers[]="Referer: https://toefl-registration.ets.org/TOEFLWeb/extISERLogonPrompt.do"; 

    curl_setopt($ch, CURLOPT_URL, $login_url); 
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); 
    curl_setopt($ch, CURLOPT_VERBOSE, true); 
    curl_setopt($ch, CURLOPT_POST, 1); 
    curl_setopt($ch, CURLOPT_POSTFIELDS, $POSTFIELDS); 
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); 
    $result = curl_exec($ch); 
    print $result; 
} 
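
A side note on the certificate step: with CURLOPT_SSL_VERIFYPEER enabled, curl_exec() returns false when the CA bundle cannot be read or verification fails, and a script that just prints the result then prints nothing. Below is a minimal sketch of surfacing that error. The helper name exec_or_fail and the loopback URL are made up for illustration and are not part of the working code above; the sketch also assumes CURLOPT_RETURNTRANSFER is set on the handle.

```php
<?php
// Hypothetical helper (not from the answer above): run a cURL handle and
// turn a silent `false` result into an exception carrying cURL's message.
// Assumes CURLOPT_RETURNTRANSFER is enabled, so success yields a string.
function exec_or_fail($ch): string
{
    $result = curl_exec($ch);
    if ($result === false) {
        throw new RuntimeException(
            'cURL error ' . curl_errno($ch) . ': ' . curl_error($ch)
        );
    }
    return $result;
}

if (extension_loaded('curl')) {
    // Assumption: nothing listens on port 1 of the loopback interface,
    // so this request fails quickly and exercises the error path.
    $demo = curl_init('http://127.0.0.1:1/');
    curl_setopt($demo, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($demo, CURLOPT_CONNECTTIMEOUT, 2);
    try {
        echo exec_or_fail($demo);
    } catch (RuntimeException $e) {
        echo $e->getMessage(), PHP_EOL;
    }
}
```

In the script above, the same check could replace the bare curl_exec($ch) calls in get_cookie() and login(), which would show right away whether cacert.pem was found.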
2

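Try comparing the HTTP headers sent by your cURL script with the headers your browser sends (use the Chrome developer tools). Perhaps the remote server is rejecting you because of missing header information.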
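
One way to do that comparison, as a rough sketch: paste both raw header blocks into strings and list the header names the browser sent that the script did not. The function names and the sample header blocks here are invented for illustration; they are not taken from the ETS site.

```php
<?php
// Parse a raw header block into [lower-cased name => value].
// The request line ("POST /path HTTP/1.1") does not match and is skipped.
// If a header name repeats, the last value wins.
function parse_headers(string $raw): array
{
    $headers = [];
    foreach (preg_split('/\r\n|\r|\n/', trim($raw)) as $line) {
        if (preg_match('/^([A-Za-z0-9-]+):\s*(.*)$/', $line, $m)) {
            $headers[strtolower($m[1])] = trim($m[2]);
        }
    }
    return $headers;
}

// Names of headers present in the browser request but absent from the script's.
function headers_missing_from_script(string $browserRaw, string $scriptRaw): array
{
    return array_values(array_diff(
        array_keys(parse_headers($browserRaw)),
        array_keys(parse_headers($scriptRaw))
    ));
}

// Made-up sample data, for illustration only.
$browser = "POST /login HTTP/1.1\r\nHost: example.test\r\n"
         . "Accept: text/html\r\nAccept-Encoding: gzip, deflate\r\nCookie: a=1";
$script  = "POST /login HTTP/1.1\r\nHost: example.test\r\n"
         . "Accept: */*\r\nCookie: a=1";

print_r(headers_missing_from_script($browser, $script));
```

The script's side can come from the CURLOPT_VERBOSE output, or from setting CURLINFO_HEADER_OUT to true before the request and reading curl_getinfo($ch, CURLINFO_HEADER_OUT) afterwards.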

Make sure the cookie file has full permissions. From php.net:

When specifying the CURLOPT_COOKIEFILE or CURLOPT_COOKIEJAR options, don't forget to "chmod 777" the directory where the cookie file must be created.
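
Before changing permissions, it may help to check whether PHP can actually write the cookie file. Here is a small sketch of that check; cookie_jar_problem is a hypothetical helper name, and the temp-directory path is only an example location, not the one used in the question.

```php
<?php
// Hypothetical helper: returns null when the cookie jar path looks usable
// by the current PHP process, otherwise a short description of the problem.
function cookie_jar_problem(string $path): ?string
{
    $dir = dirname($path);
    if (!is_dir($dir)) {
        return "directory does not exist: $dir";
    }
    if (file_exists($path)) {
        // An existing file must itself be writable.
        return is_writable($path) ? null : "cookie file is not writable: $path";
    }
    // A new file can only be created if the directory is writable.
    return is_writable($dir) ? null : "directory is not writable: $dir";
}

// Example location (an assumption): a file in the system temp directory.
$ck = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'cookie.txt';
$problem = cookie_jar_problem($ck);
echo $problem ?? "cookie jar looks usable: $ck", PHP_EOL;
```

What matters is that the user running PHP can write there; an absolute path, as in the working code above, also avoids surprises about the current working directory.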

+0

Why the downvote? I think this approach could reveal the fact that the form is protected by some kind of parameter. – beiller

+0

Hey Lonewolf, can you add the POST that your script builds? Where do you think x and y come from? Are you sure you are capturing the TOKEN correctly? – beiller

+0

@beiller The permissions are fine. – Lonewolf
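
On the question of whether the TOKEN is captured correctly: the pattern /value="(.*)"/ from the question is greedy, so if any attribute follows value on the same tag, the match runs on to the last double quote. A sketch of reading the hidden field by name with PHP's bundled DOM extension instead is shown here. The helper name and the sample markup are invented for illustration; the real login page may be structured differently.

```php
<?php
// Hypothetical helper: return the value attribute of the <input> with the
// given name, or null if no such input exists in the markup.
function extract_input_value(string $html, string $name): ?string
{
    if ($html === '') {
        return null;
    }
    $doc = new DOMDocument();
    // Collect parser warnings internally; real-world HTML is rarely well-formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('input') as $input) {
        if ($input->getAttribute('name') === $name) {
            return $input->getAttribute('value');
        }
    }
    return null;
}

// Made-up sample markup with an attribute after value.
$sample = '<form class="loginform"><input type="hidden" '
        . 'name="org.apache.struts.taglib.html.TOKEN" value="abc123" id="t"></form>';

// The greedy regex from the question over-captures on this sample.
preg_match('/value="(.*)"/', $sample, $match);
var_dump($match[1]);

// Looking the field up by name returns just the token.
var_dump(extract_input_value($sample, 'org.apache.struts.taglib.html.TOKEN'));
```

A non-greedy pattern such as /value="([^"]*)"/ would also avoid the over-capture, but selecting the input by its name does not depend on attribute order or on the token being the first input in the form.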