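PHP web scraping tutorial fails

I'm trying to wrap my head around PHP web scraping with cURL. I recently picked up a small book on the subject, but I'm stuck on one of its tutorials and can't find where the error is. The cookie.txt file gets created, so I know at least part of the function is executing correctly.

I've tried both the id and the name attributes of the name and password input fields, with no luck. As far as I can tell, I'm also using the correct POST URL.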
<?php
// Function to submit form using cURL POST method
function curlPost($postUrl, $postFields, $successString) {
    $useragent = 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3'; // Setting user agent of a very old, yet popular browser
    $cookie = 'cookie.txt'; // Setting a cookie file to store cookies
    $ch = curl_init(); // Initializing cURL session
    // Setting cURL options
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE); // Prevent cURL from verifying SSL certificate
    curl_setopt($ch, CURLOPT_FAILONERROR, TRUE); // Treat HTTP status codes >= 400 as errors (curl_exec() returns FALSE)
    curl_setopt($ch, CURLOPT_COOKIESESSION, TRUE); // Start a new cookie session (ignore session cookies from a previous run)
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE); // Follow Location: headers
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE); // Returning transfer as a string
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie); // Setting cookiefile
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie); // Setting cookiejar
    curl_setopt($ch, CURLOPT_USERAGENT, $useragent); // Setting useragent
    curl_setopt($ch, CURLOPT_URL, $postUrl); // Setting URL to POST
    curl_setopt($ch, CURLOPT_POST, TRUE); // Setting method as POST
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postFields)); // Setting POST fields as a URL-encoded query string
    $results = curl_exec($ch); // Executing cURL session; returns FALSE on failure
    curl_close($ch); // Closing cURL session
    // Checking if login was successful by checking existence of string.
    // strpos() returns 0 for a match at the very start, so compare strictly against FALSE.
    if ($results !== FALSE && strpos($results, $successString) !== FALSE) {
        return $results;
    } else {
        return FALSE;
    }
}
$userEmail = '[email protected]'; // Setting your email address for site login
$userPass = 'password'; // Setting your password for site login
$postUrl = 'https://www.packtpub.com/'; // Setting URL to POST to
// Setting form input fields as 'name' => 'value'
$postFields = array(
    'name' => $userEmail,
    'password' => $userPass,
    'form_id' => 'packt-login-form-header'
);
$successString = 'You are logged in as';
$loggedIn = curlPost($postUrl, $postFields, $successString); // Executing curlPost login and storing results page in $loggedIn
?>
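One way to check which field names a login form actually expects is to parse the page's HTML and list its inputs. Browsers submit the name attribute of each input, not the id, and forms often carry hidden fields that have to be posted back as well. The following is only a rough sketch: the listFormInputs helper, the form id 'login-form' and the sample markup are all made up for illustration and are not taken from the real site.

```php
<?php
// Hypothetical helper: returns name => value for every named <input>
// inside the form with the given id attribute.
function listFormInputs(string $html, string $formId): array {
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $fields = [];
    // Only inputs with a name attribute are submitted with the form
    foreach ($xpath->query("//form[@id='" . $formId . "']//input[@name]") as $input) {
        $fields[$input->getAttribute('name')] = $input->getAttribute('value');
    }
    return $fields;
}

// Made-up stand-in for the login page; in practice this would be
// the body returned by a GET request to the page that holds the form.
$sampleHtml = <<<HTML
<html><body>
<form id="login-form" action="/login" method="post">
  <input type="text" id="edit-email" name="email">
  <input type="password" id="edit-pass" name="password">
  <input type="hidden" name="form_build_id" value="form-abc123">
  <input type="hidden" name="form_id" value="login_form">
</form>
</body></html>
HTML;

print_r(listFormInputs($sampleHtml, 'login-form'));
```

The keys of the returned array are the names to use in $postFields, and any hidden values found this way would normally be sent along with the credentials. The form's action attribute is also worth comparing against the POST URL.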
The 'CURLOPT_COOKIEFILE' / 'CURLOPT_COOKIEJAR' options must be set to absolute path values. 'cookie.txt' is a relative path. – hindmost
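A minimal sketch of that suggestion, assuming the cookie file should live next to the script: a relative path is resolved against the process's current working directory, which is not always the script's directory, so building the path from __DIR__ removes the ambiguity. The cookiePath helper is a hypothetical name used only for illustration.

```php
<?php
// Hypothetical helper: build an absolute path for a file
// located in the same directory as this script.
function cookiePath(string $fileName): string {
    return __DIR__ . DIRECTORY_SEPARATOR . $fileName;
}

$cookie = cookiePath('cookie.txt');

// The cookie jar is written when the cURL handle is closed,
// so the directory has to be writable by the PHP process.
if (!is_writable(dirname($cookie))) {
    echo 'Warning: cannot write to ' . dirname($cookie) . PHP_EOL;
}

echo $cookie . PHP_EOL;

// Inside curlPost() the same value would then be passed to both options:
//   curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie);
//   curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);
```

Whether this alone makes the login succeed depends on the site; it only ensures that the cookies are read from and written to the same, known location.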