2013-04-15 79 views
0

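cURL login and retrieve page

I'm trying to write a script that logs in to a password-protected area of a website running on Zen Cart and grabs the HTML as a string. So far it downloads the HTML page as an .html file, but only the public version of the page (i.e. the login did not succeed). The site uses HTTPS, and I believe that may be part of the reason it isn't working. What is wrong with my script, and which cURL settings do I need? (Sadly, there is very little documentation on cURL for PHP :( )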

<?php 
$username = "[email protected]"; 
$password = "xxxxx"; 
$securityToken = "b6afe5babdd1b6be234d1976586fb1f1"; 
$loginUrl = "https://www.xxxxxxx.com.au/index.php?main_page=login&action=process"; 

//init curl 
$ch = curl_init(); 

//Set the URL to work with 
curl_setopt($ch, CURLOPT_URL, $loginUrl); 

// ENABLE HTTP POST 
curl_setopt($ch, CURLOPT_POST, 1); 

//Set the post parameters 
curl_setopt($ch, CURLOPT_POSTFIELDS, 'email_address='.$username.'&password='.$password.'&securityToken='.$securityToken); 

//Handle cookies for the login 
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie122.txt'); 

//Setting CURLOPT_RETURNTRANSFER variable to 1 will force cURL 
//not to print out the results of its query. 
//Instead, it will return the results as a string return value 
//from curl_exec() instead of the usual true/false. 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 

//execute the request (the login) 
$store = curl_exec($ch); 

//the login is now done and you can continue to get the 
//protected content. 

//set the URL to the protected file 
curl_setopt($ch, CURLOPT_URL, 'http://www.xxxxxxx.com.au/index.php?main_page=product_info&products_id=1488'); 

//execute the request 
$content = curl_exec($ch); 

//save the data to disk 
file_put_contents('test122.html', $content); 

if(curl_error($ch)) { 
$error = curl_error($ch); 
echo $error; 
} 


?> 

Answers

0

Try a different URL and data, and build the POST body with http_build_query().

$loginUrl = "https://www.xxxxxxx.com.au/index.php"; 
... 
$args = array(
    'main_page'  => 'login', 
    'action'  => 'process', 
    'email_address' => $username, 
    'password'  => $password, 
    'securityToken' => $securityToken 
); 
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($args)); 

You may also need:

curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); 

This may also be useful:

curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false); 
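
To illustrate the http_build_query() suggestion, here is a minimal sketch of why building the POST body by string concatenation can break when a value contains characters such as & or =. The credentials are made up for the example and are not from the question.

```php
<?php
// Hypothetical credentials, chosen only because they contain reserved characters.
$args = array(
    'email_address' => 'user@example.com',
    'password'      => 'p&ss=w rd',
);

// Concatenation as in the question: the raw "&" and "=" are sent unescaped,
// so the server would see the password cut off at the "&".
$naive = 'email_address=' . $args['email_address'] . '&password=' . $args['password'];

// http_build_query() URL-encodes every key and value.
$encoded = http_build_query($args);

echo $naive, "\n";
echo $encoded, "\n"; // email_address=user%40example.com&password=p%26ss%3Dw+rd
```

If the real password contains only letters and digits, both forms produce the same body, so this alone would not explain a failed login.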
+0

Thanks, but no luck. – Melbourne2991

0

A complete working example:

$ch = curl_init($url); 

    $headers = array(); 

    //post 
    $post = 'a=b&d=c'; 
    $headers[] = 'Content-type: application/x-www-form-urlencoded;charset=utf-8'; 
    $headers[] = 'Content-Length: ' . strlen($post); 

    curl_setopt($ch, CURLOPT_POST, 1); 
    curl_setopt($ch, CURLOPT_POSTFIELDS, $post); 

    //if get 
    // $headers[] = 'Content-type: charset=utf-8'; 

    $headers[] = 'Connection: Keep-Alive'; 

    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); 
    curl_setopt($ch, CURLOPT_RETURNTRANSFER,1); 
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);// Follow redirects 
    curl_setopt($ch, CURLOPT_AUTOREFERER, 1);// Set referer on redirect 
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);// Stop after x redirects 
    curl_setopt($ch, CURLOPT_TIMEOUT, 20);// Stop after x seconds 

    //username and password (if any): 
    curl_setopt($ch, CURLOPT_USERPWD, "$username:$password"); 
    curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC); 

    //https 
    $https = strpos($url, "https://"); 
    if($https !== false){ 
     curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
    } 

    $response = curl_exec($ch); 
    curl_close($ch); 

    echo($response); 
+0

Thanks, I'll give it a go now. – Melbourne2991

+0

It didn't work; it redirected me to the login page with the following error: "There was a security error when trying to login." – Melbourne2991

0

You have to follow a few steps with HTTPS sites.

Export the certificate.

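Go to the site's login page in your browser (Firefox in my example). Double-click the "lock" icon > Security > View Certificate > Details > Export. Save it as X.509.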

Upload it to where your script can see it.

Then add

curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true); 
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2); 
curl_setopt($ch, CURLOPT_CAINFO, getcwd() . "/path/to/certificateCA.crt"); 
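
One caveat with the snippet above: getcwd() returns the process's current working directory, which is not necessarily the directory the script lives in (for example when the script is run from cron or included from another file). A minimal sketch, assuming the exported certificate is stored relative to the script itself; resolveCaFile() is a hypothetical helper, not part of cURL:

```php
<?php
/**
 * Build a path to the CA file and fail early if it is missing,
 * instead of letting curl_exec() fail later with a less obvious SSL error.
 */
function resolveCaFile(string $baseDir, string $relativePath): string
{
    $path = rtrim($baseDir, '/\\') . DIRECTORY_SEPARATOR . ltrim($relativePath, '/\\');
    if (!is_file($path) || !is_readable($path)) {
        throw new RuntimeException("CA certificate not found or unreadable: $path");
    }
    return $path;
}

// Usage (the path is the placeholder from the answer above):
// curl_setopt($ch, CURLOPT_CAINFO, resolveCaFile(__DIR__, 'path/to/certificateCA.crt'));
```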
+0

Argh, still not working. In case nobody has mentioned it, though, I've noticed that the answers I'm getting vary widely from one another. – Melbourne2991

+0

Here is an example of logging in over HTTPS and then accessing a page behind the login: http://www.herikstad.net/2011/06/logging-to-https-websites-using-php.html – liyakat

+0

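Cool, thanks. I noticed that the "securityToken" actually changes. Does that mean I need to write the script so that it grabs the security token before cURL does its thing? – Melbourne2991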
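
On the securityToken question: if the token changes per session, a hard-coded value would be rejected, so the script would need to request the login form first, read the token out of the HTML, and post it back with the same cookies. Below is a rough sketch under stated assumptions, not a confirmed fix. It assumes the login form carries the token in a hidden input named securityToken (check the page source), and that the caller supplies the three URLs. extractSecurityToken() and fetchProtectedPage() are hypothetical names. The sketch also switches the handle back to GET for the final request, because CURLOPT_POST otherwise stays set on a reused handle.

```php
<?php
/**
 * Return the value of the <input name="securityToken"> field, or null if absent.
 * Assumption: the token is exposed as an input attribute in the login form.
 */
function extractSecurityToken(string $html): ?string
{
    // Find the whole <input ...> tag first, so attribute order does not matter.
    if (!preg_match('/<input\b[^>]*\bname=["\']securityToken["\'][^>]*>/i', $html, $tag)) {
        return null;
    }
    if (!preg_match('/\bvalue=["\']([^"\']*)["\']/i', $tag[0], $value)) {
        return null;
    }
    return html_entity_decode($value[1]);
}

/**
 * Log in and fetch a protected page on one cURL handle.
 * All URLs are supplied by the caller; none of them are known Zen Cart endpoints.
 */
function fetchProtectedPage(
    string $loginFormUrl,
    string $loginActionUrl,
    string $protectedUrl,
    string $email,
    string $password,
    string $cookieFile
): string {
    $ch = curl_init();
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => $cookieFile, // read cookies, enables the cookie engine
        CURLOPT_COOKIEJAR      => $cookieFile, // write cookies when the handle is closed
    ));

    // 1. GET the login form: starts the session and exposes the current token.
    curl_setopt($ch, CURLOPT_URL, $loginFormUrl);
    $form  = curl_exec($ch);
    $token = is_string($form) ? extractSecurityToken($form) : null;
    if ($token === null) {
        throw new RuntimeException('No securityToken found: ' . curl_error($ch));
    }

    // 2. POST the credentials together with the fresh token.
    curl_setopt($ch, CURLOPT_URL, $loginActionUrl);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(array(
        'email_address' => $email,
        'password'      => $password,
        'securityToken' => $token,
    )));
    if (curl_exec($ch) === false) {
        throw new RuntimeException('Login request failed: ' . curl_error($ch));
    }

    // 3. Switch back to GET and request the protected page.
    curl_setopt($ch, CURLOPT_HTTPGET, true);
    curl_setopt($ch, CURLOPT_URL, $protectedUrl);
    $content = curl_exec($ch);
    if ($content === false) {
        throw new RuntimeException('Page request failed: ' . curl_error($ch));
    }
    return $content;
}
```

Note that the question's script logs in over https:// but requests the product page over http://. If the session cookie is marked secure, it may not be sent on that second request, so using the same scheme for every URL is the safer choice.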