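python-requests cannot log in to the website

I need to scrape a website that requires a login. I am trying to create a session and log in, because after logging in I have to scrape several different pages. But I can't figure out why it isn't working.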
import requests
from bs4 import BeautifulSoup

# login_url and url are defined elsewhere (not shown in the question).
# The keys match the "name" attributes of the form fields.
login_data = {
    "log": "login",
    "login": "my email",
    "password": "my password",
}
session = requests.session()
session.post(login_url, data=login_data)
response = session.get(url)
html = response.text
soup = BeautifulSoup(html, "html.parser")
print(soup.title.get_text())
The printed title shows that the login did not work.
This is the site's login form:
<form method="post" id="signin-form" class="form-horizontal">
<input type="hidden" name="referer" value="" />
<div class="form-group">
<label for="email_text" class="col-sm-4 control-label">Your login (email):</label>
<div class="col-sm-8">
<input type="email" class="form-control" id="email_text" value="" name="login" autofocus
data-validation='{"parent":".form-group","events":["keyup","blur"],"rules":[{"name":"notblank"},{"name":"email"}]}' />
</div>
</div>
<div class="form-group">
<label for="password_text" class="col-sm-4 control-label">Password:</label>
<div class="col-sm-8">
<input type="password" class="form-control" id="password_text" name="password"
data-validation='{"parent":".form-group","rules":[{"name":"min","min":5}]}' />
</div>
</div>
<div class="form-group">
<div class="col-sm-8 col-sm-offset-4">
<div class="checkbox">
<label>
<input type="checkbox" name="rememberme"> Remember me on this computer
</label>
</div>
</div>
</div>
<div class="form-group">
<div class="col-sm-offset-4 col-sm-8">
<button type="submit" class="btn btn-default btn-lg" name="log">Log into your account</button>
<a class="btn btn-default btn-lg mobile-show-inline-block" href="/account/create/">Create account</a>
<a href="/account/lostpassword" class="btn btn-link btn-lg">Forgot your password?</a>
</div>
</div>
</form>
N.B.: Please don't suggest selenium. I can do this with selenium, and I have tested that, but I have to stick with requests, because selenium pops up a console window even when I use PhantomJS.
Try requesting the login page first. Maybe it sets some cookies that it expects to be present in the POST.
@JohnGordon Wow! That works. Please post it as an answer.
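For later readers, the suggestion in the comment can be sketched roughly as follows. This is only a minimal sketch, not the site's documented login flow: the helper name `login_and_fetch` is hypothetical, and it assumes the form posts back to the login page's own URL (the form shown has no `action` attribute) and that the cookies set by the initial GET are all the server needs. If the site also expects a token or other hidden fields, those would have to be added to the payload.

```python
import requests


def login_and_fetch(login_url, login_data, target_url, session=None):
    """Hypothetical helper: GET the login page, POST the form, fetch a page."""
    # Reuse a caller-supplied session, or create one so cookies persist
    # across all three requests.
    session = session or requests.Session()

    # Step 1: request the login page first so the server can set any
    # cookies it expects to see on the subsequent POST.
    session.get(login_url).raise_for_status()

    # Step 2: submit the credentials; the session automatically sends
    # back the cookies received in step 1.
    session.post(login_url, data=login_data).raise_for_status()

    # Step 3: fetch the protected page with the same session.
    page = session.get(target_url)
    page.raise_for_status()
    return page.text
```

The returned HTML can then be passed to `BeautifulSoup(html, "html.parser")` exactly as in the question, and the same session can be reused for the other pages that need scraping.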