2012-01-24 32 views
17

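Login to Google with PHP and Curl, Cookie turned off?

I have this code for logging in to Google using Simple DOM Parser with curl. I've tried adding a cookiejar file, but to no avail. I keep getting the message:

Your browser's cookie functionality is turned off. Please turn it on.

Any ideas on how to solve this?

Here's my code for reference: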

$html = file_get_html('https://accounts.google.com/ServiceLogin?hl=en&service=alerts&continue=http://www.google.com/alerts/manage'); 

//... some code for getting post data here 

$curl_connection = curl_init('https://accounts.google.com/ServiceLoginAuth'); 
curl_setopt($curl_connection, CURLOPT_CONNECTTIMEOUT, 30); 
curl_setopt($curl_connection, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"); 
curl_setopt($curl_connection, CURLOPT_RETURNTRANSFER, true); 
curl_setopt($curl_connection, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($curl_connection, CURLOPT_FOLLOWLOCATION, 1); 
curl_setopt($curl_connection, CURLOPT_COOKIEJAR, COOKIEJAR); 
curl_setopt($curl_connection, CURLOPT_COOKIEFILE, COOKIEJAR); 
curl_setopt($curl_connection, CURLOPT_HEADER, true); 
curl_setopt($curl_connection, CURLOPT_RETURNTRANSFER,1); 
curl_setopt($curl_connection, CURLOPT_CONNECTTIMEOUT, 120); 
curl_setopt($curl_connection, CURLOPT_TIMEOUT, 120); 
curl_setopt($curl_connection, CURLOPT_POSTFIELDS, $post_string); 

$result = curl_exec($curl_connection); 
curl_close($curl_connection); 

echo $result; 

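You should probably fetch the URL using curl instead of the `file_get_html` function, since it may set some cookies that the authentication service might be looking for on the form submission. Also, can you confirm that the file specified by `COOKIEJAR` is being created and contains cookies? – drew010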
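
One way to check the second point is to read the jar file back and list the cookies in it. This is a minimal sketch, assuming the jar is in the Netscape cookie-file format that libcurl writes; `listJarCookies` is a hypothetical helper, not part of the original code. Keep in mind that libcurl only writes the jar when the handle is closed or destroyed, so the file is worth inspecting after that point.

```php
<?php
// Hypothetical helper: list the cookies stored in a Netscape-format cookie jar
// (the format libcurl uses for CURLOPT_COOKIEJAR).
function listJarCookies(string $jarContents): array
{
    $cookies = array();
    foreach (preg_split('/\R/', $jarContents) as $line) {
        if (strpos($line, '#HttpOnly_') === 0) {
            // HttpOnly cookies are stored with a "#HttpOnly_" prefix on the domain
            $line = substr($line, strlen('#HttpOnly_'));
        } elseif ($line === '' || $line[0] === '#') {
            continue; // blank line or comment
        }
        $parts = explode("\t", $line);
        if (count($parts) === 7) {
            // fields: domain, subdomain flag, path, secure, expiry, name, value
            $cookies[$parts[5]] = $parts[6];
        }
    }
    return $cookies;
}

// Usage, assuming the COOKIEJAR constant from the question:
// print_r(listJarCookies(file_get_contents(COOKIEJAR)));
```

An empty result after a request would suggest the cookies never reached the jar, rather than being rejected by the server.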

I checked the COOKIEJAR file, and it contains some text. I also set the curl_init URL to the same URL as in file_get_html, and still the same thing, no cookies for me. :( – kazuo

I did get some headers here, though. They are: HTTP/1.1 200 OK Set-Cookie: GoogleAccountsLocale_session=en; Secure Set-Cookie: GAPS=1:ZuuFm50cJM2_fiqQc38hkyuCjZXRRg:bMuhAssScKIBtI1L; Path=/; Expires=Thu, 23-Jan-2014 18:32:24 GMT; Secure; HttpOnly Content-Type: text/html; charset=UTF-8 Strict-Transport-Security: max-age=2592000; includeSubDomains Date: Tue, 24 Jan 2012 18:32:24 GMT Expires: Tue, 24 Jan 2012 18:32:24 GMT Cache-Control: private, max-age=0 X-Content-Type-Options: nosniff X-XSS-Protection: 1; mode=block Content-Length: 1848 Server: GSE – kazuo
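
To make a header dump like this easier to read, the cookie names and values can be pulled out of the raw header text. This is a rough sketch, assuming the headers come back as one string (as they do when `CURLOPT_HEADER` is enabled); `cookiesFromHeaders` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: collect cookie names and values from a raw response header block.
function cookiesFromHeaders(string $rawHeaders): array
{
    $cookies = array();
    // one match per "Set-Cookie: name=value" line; attributes after the first ";" are ignored
    $pattern = '/^Set-Cookie:\s*([^=;\s]+)=([^;\r\n]*)/mi';
    if (preg_match_all($pattern, $rawHeaders, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            $cookies[$match[1]] = $match[2];
        }
    }
    return $cookies;
}
```

Comparing this list with what ends up in the cookie jar shows whether the cookies the server sent are actually being stored.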

Answer

26

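Here is some modified code that works.

It first requests the login page to get the initial cookies and to extract the values required by the login form. Next, it performs a POST to the login service. It then checks whether the response is trying to redirect to the destination URL using JavaScript and meta tags.

It seemed like you already have code for grabbing the form fields, so I didn't post mine, but let me know if you need it. Just make sure $formFields is an associative array in which the keys are the field names and the values are the field values.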
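
The redirect check in the last step could also be made more specific by extracting the target of a meta refresh tag. The following is only a sketch under assumptions: `metaRefreshUrl` is a hypothetical helper, and the regex handles just a simple `<meta http-equiv="refresh">` tag with the attributes in that order.

```php
<?php
// Hypothetical helper: return the URL of a <meta http-equiv="refresh"> tag, or null if none is found.
function metaRefreshUrl(string $html): ?string
{
    $pattern = '/<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*content=["\']\s*\d+\s*;\s*url=([^"\'>]+)/i';
    if (preg_match($pattern, $html, $m)) {
        // the URL may be wrapped in entity-encoded quotes (&#39;...&#39;)
        return trim(html_entity_decode($m[1], ENT_QUOTES), " '\"");
    }
    return null;
}
```

A non-null result would suggest the login page is trying to send the client on to the `continue` URL.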
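
As an illustration of the expected shape, here is a small sketch of such an array and the POST string built from it. The values are made up for the example; the real field names and tokens come from the login form.

```php
<?php
// Illustrative values only: real hidden-field names and tokens come from the login page
$formFields = array(
    'continue' => 'http://www.google.com/alerts/manage',
    'service'  => 'alerts',
    'Email'    => 'user@example.com',
    'Passwd'   => 'password',
    'signIn'   => 'Sign in',
);

// build the urlencoded body; the separator is passed explicitly so ini settings cannot change it
$post_string = http_build_query($formFields, '', '&');

echo $post_string;
```

`http_build_query` takes care of the URL encoding, so the values should be stored unencoded in the array.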

<?php 

/** 
* Log in to Google account and go to account page 
* 
*/ 

$USERNAME = 'user@example.com'; // placeholder: replace with your account email 
$PASSWORD = 'password'; 
$COOKIEFILE = 'cookies.txt'; 

// initialize curl handle used for all requests 
$ch = curl_init(); 

// set some options on the handle 
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 120); 
curl_setopt($ch, CURLOPT_TIMEOUT, 120); 
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:49.0) Gecko/20100101 Firefox/49.0"); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response instead of printing it 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // disables certificate verification; avoid outside of testing 
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); 
curl_setopt($ch, CURLOPT_COOKIEJAR, $COOKIEFILE); // cookies are written here when the handle is closed 
curl_setopt($ch, CURLOPT_COOKIEFILE, $COOKIEFILE); // cookies are read from here; also enables cookie handling 
curl_setopt($ch, CURLOPT_HEADER, false); 

// url of our first request fetches the account login page 
curl_setopt($ch, CURLOPT_URL, 
    'https://accounts.google.com/ServiceLogin?hl=en&service=alerts&continue=http://www.google.com/alerts/manage'); 
$data = curl_exec($ch); 

// extract form fields from account login page 
$formFields = getFormFields($data); 

// inject email and password into form 
$formFields['Email'] = $USERNAME; 
$formFields['Passwd'] = $PASSWORD; 
unset($formFields['PersistentCookie']); 

$post_string = http_build_query($formFields); // build urlencoded POST string for login 

// set url to login page as a POST request 
curl_setopt($ch, CURLOPT_URL, 'https://accounts.google.com/ServiceLoginAuth'); 
curl_setopt($ch, CURLOPT_POST, 1); 
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_string); 

// execute login request 
$result = curl_exec($ch); 

// check for "Redirecting" message in title to indicate success 
// based on your language - you may need to change this to match some other string 
if (strpos($result, '<title>Redirecting') === false) { 
    var_dump($result); // dump the returned page before exiting to help with debugging 
    die("Login failed"); 
} 

// login likely succeeded - request account page; switch back to a regular GET 
curl_setopt($ch, CURLOPT_URL, 'https://myaccount.google.com/?utm_source=OGB'); 
curl_setopt($ch, CURLOPT_HTTPGET, true); 

// execute request for account page using our cookies 
$result = curl_exec($ch); 

echo $result; 


// helper functions below 

// find google "#gaia_loginform" for logging in 
function getFormFields($data) 
{ 
    if (preg_match('/(<form.*?id=.?gaia_loginform.*?<\/form>)/is', $data, $matches)) { 
     $inputs = getInputs($matches[1]); 

     return $inputs; 
    } else { 
     die('didnt find login form'); 
    } 
} 

// extract all <input fields from a form 
function getInputs($form) 
{ 
    $inputs = array(); 

    $elements = preg_match_all('/(<input[^>]+>)/is', $form, $matches); 

    if ($elements > 0) { 
     for($i = 0; $i < $elements; $i++) { 
      $el = preg_replace('/\s{2,}/', ' ', $matches[1][$i]); 

      if (preg_match('/name=(?:["\'])?([^"\'\s]*)/i', $el, $nameMatch)) { 
       $name = $nameMatch[1]; 
       // default to an empty string when the input has no value attribute 
       $value = ''; 

       if (preg_match('/value=(?:["\'])?([^"\'\s]*)/i', $el, $valueMatch)) { 
        $value = $valueMatch[1]; 
       } 

       $inputs[$name] = $value; 
      } 
     } 
    } 

    return $inputs; 
} 
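
The regex-based extraction above stops at the first whitespace inside an attribute value, so a value such as "Sign in" would be cut short. As a possible alternative, here is a sketch that uses PHP's DOM extension instead; `getInputsDom` is a hypothetical name, and the default form id is taken from the `#gaia_loginform` comment above.

```php
<?php
// Hypothetical alternative to getInputs(): read the form's inputs with DOMDocument and XPath.
function getInputsDom(string $html, string $formId = 'gaia_loginform'): array
{
    $inputs = array();

    $doc = new DOMDocument();
    // real-world pages are rarely valid HTML, so keep parser warnings out of the output
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // every <input> with a name attribute inside the form with the given id
    foreach ($xpath->query("//form[@id='$formId']//input[@name]") as $input) {
        // getAttribute() returns an empty string when the attribute is missing
        $inputs[$input->getAttribute('name')] = $input->getAttribute('value');
    }

    return $inputs;
}
```

The result has the same shape as `$formFields`, so it could be passed to `http_build_query` in the same way.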

Wow! Thank you! I tried it out, but I get a "Login failed". I put my post_data array into your formFields array. Here is the string: continue=http%3A%2F%2Fwww.google.com%2Falerts%2Fmanage&service=alerts&dsh=-6553802846829809996&hl=en&GALX=Cg4X gqEmZ_w&timeStmp=&secTok=&Email=xxxxxxxx&Passwd=xxxxxxxxxxx&signIn=Sign+in&rmShown=1 After the failure there is no other output. – kazuo

Never mind, I got it, thanks! :D I'll try to see what works in yours and what doesn't in mine. – kazuo

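It looks okay, assuming it's being passed to the right curl variables. I just updated the code with the full version that grabs the hidden fields. Enter your username and password in there and see if it works for you. I just confirmed again that the whole example works. If the login fails, it should var_dump the resulting page. – drew010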