Pulling text using Python Requests and Beautiful Soup

Thank you for taking the time to look at my question. I was wondering whether there is any way to pull the data-sitekey from this text. Here is the link to the page: https://e-com.secure.force.com/adidasUSContact/

<div class="g-recaptcha" data-sitekey="6LfI8hoTAAAAAMax5_MTl3N-5bDxVNdQ6Gx6BcKX" data-type="image" id="ncaptchaRecaptchaId"><div style="width: 304px; height: 78px;"><div><iframe src="https://www.google.com/recaptcha/api2/anchor?k=6LfI8hoTAAAAAMax5_MTl3N-5bDxVNdQ6Gx6BcKX&amp;co=aHR0cHM6Ly9lLWNvbS5zZWN1cmUuZm9yY2UuY29tOjQ0Mw..&amp;hl=en&amp;type=image&amp;v=r20160921114513&amp;size=normal&amp;cb=ei2ddcb6rl03" title="recaptcha widget" width="304" height="78" role="presentation" frameborder="0" scrolling="no" name="undefined"></iframe></div><textarea id="g-recaptcha-response" name="g-recaptcha-response" class="g-recaptcha-response" style="width: 250px; height: 40px; border: 1px solid #c1c1c1; margin: 10px 25px; padding: 0px; resize: none; display: none; "></t

Here is my current code:

import requests 
from bs4 import BeautifulSoup 

headers = { 
    'Host' : 'e-com.secure.force.com', 
    'Connection' : 'keep-alive', 
    'Upgrade-Insecure-Requests' : '1', 
    'User-Agent' : 'Mozilla/5.0 (Windows NT 6.1; WOW64)', 
    'Accept' : 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8', 
    'Accept-Encoding' : 'gzip, deflate, sdch', 
    'Accept-Language' : 'en-US,en;q=0.8' 
} 
url = 'https://e-com.secure.force.com/adidasUSContact/' 
r = requests.get(url, headers=headers) 
soup = BeautifulSoup(r, 'html.parser') 
c = soup.find_all('div', attrs={"class": "data-sitekey"}) 
print c

Source

2016-09-28 Tony sanchez

Where is your code so far? There is 100 percent a way to do this, but it would be nice to see your effort so far.

@PadraicCunningham updated

@jonrsharpe updated

OK, now that we have the code, it is as simple as:

import requests 
from bs4 import BeautifulSoup 


soup = BeautifulSoup(requests.get("https://e-com.secure.force.com/adidasUSContact/").content, "html.parser") 

key = soup.select_one("#ncaptchaRecaptchaId")["data-sitekey"]

data-sitekey is an attribute, not a CSS class, so you just need to extract it from the element. You can find that element by its id, as above.
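To make the attribute-versus-class distinction concrete, here is a minimal offline sketch. It assumes a trimmed-down copy of the widget markup from the question (the SAMPLE_HTML string is made up for illustration, not fetched from the live page) and compares the class lookup from the question with a plain attribute lookup.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down copy of the widget markup from the question.
SAMPLE_HTML = (
    '<div class="g-recaptcha" '
    'data-sitekey="6LfI8hoTAAAAAMax5_MTl3N-5bDxVNdQ6Gx6BcKX" '
    'data-type="image" id="ncaptchaRecaptchaId"></div>'
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Searching for a class named "data-sitekey" matches nothing,
# because "data-sitekey" is an attribute name, not a class value.
by_class = soup.find_all("div", attrs={"class": "data-sitekey"})

div = soup.find(id="ncaptchaRecaptchaId")
classes = div["class"]         # multi-valued attribute, returned as a list
sitekey = div["data-sitekey"]  # ordinary attribute, returned as a string

print(by_class)  # []
print(classes)   # ['g-recaptcha']
print(sitekey)   # 6LfI8hoTAAAAAMax5_MTl3N-5bDxVNdQ6Gx6BcKX
```

The only class on the div is g-recaptcha, which is why the class-based search in the question comes back empty.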

You can also use the class name:

# css selector 
key = soup.select_one("div.g-recaptcha")["data-sitekey"] 
# regular find using class name 
key = soup.find("div",class_="g-recaptcha")["data-sitekey"]
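If you would rather not depend on the id or the class at all, one possible variation is to select on the attribute itself. The sketch below is only an illustration: extract_sitekey is a hypothetical helper name, and returning None when nothing matches is an assumption about how you want to handle a page that does not contain the widget.

```python
from bs4 import BeautifulSoup


def extract_sitekey(html):
    """Return the first data-sitekey value found in html, or None.

    Hypothetical helper, not part of Beautiful Soup.
    """
    soup = BeautifulSoup(html, "html.parser")
    # "[data-sitekey]" is a CSS attribute selector: it matches any
    # element that carries a data-sitekey attribute.
    tag = soup.select_one("[data-sitekey]")
    if tag is None:
        return None
    return tag.get("data-sitekey")


# Offline usage with made-up markup:
print(extract_sitekey('<div class="g-recaptcha" data-sitekey="abc123"></div>'))
print(extract_sitekey("<p>no widget here</p>"))
```

With the live page you would pass requests.get(url).content as the html argument, as in the answer above.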

Source

2016-09-28 21:21:41

OK, let me try this.

@Tonysanchez, it will definitely work ;)

It worked, but now it is telling me something about markup.
