If what you want is the links to the courses on the page, this code should help:
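from selenium import webdriver
from bs4 import BeautifulSoup
import time

baseurl = 'https://www.udemy.com'
url = "https://www.udemy.com/courses/search/?q=sql&src=ukw&lang=en"

# Drive a real browser so the JavaScript-rendered results end up in the page source.
driver = webdriver.Chrome()
driver.maximize_window()
driver.get(url)
time.sleep(5)  # crude fixed wait for the dynamically loaded course cards
content = driver.page_source.encode('utf-8').strip()
soup = BeautifulSoup(content, "html.parser")

# Course titles are anchors with class "card__title"; href=True skips anchors without a link.
courseLink = soup.find_all("a", {"class": "card__title", 'href': True})
for link in courseLink:
    print(baseurl + link['href'])
driver.quit()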
It prints:
https://www.udemy.com/the-complete-sql-bootcamp/
https://www.udemy.com/the-complete-oracle-sql-certification-course/
https://www.udemy.com/introduction-to-sql23/
https://www.udemy.com/oracle-sql-12c-become-an-sql-developer-with-subtitle/
https://www.udemy.com/sql-advanced/
https://www.udemy.com/sql-for-newbs/
https://www.udemy.com/sql-for-marketers-data-analytics-data-science-big-data/
https://www.udemy.com/sql-for-punk-analytics/
https://www.udemy.com/sql-basics-for-beginners/
https://www.udemy.com/oracle-sql-step-by-step-approach/
https://www.udemy.com/microsoft-sql-for-beginners/
https://www.udemy.com/sql-tutorial-learn-sql-with-mysql-database-beginner2expert/
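The link-extraction step can also be checked on its own, without launching a browser. Below is a minimal sketch using only the Python standard library instead of BeautifulSoup. The names `CardLinkParser`, `extract_course_links`, and `SAMPLE_HTML` are hypothetical, and the sample markup is an assumption that merely mimics the `card__title` anchors the code above looks for; it is not the site's real HTML.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "https://www.udemy.com"


class CardLinkParser(HTMLParser):
    """Collect href values from <a> tags whose class list contains class_name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # A valueless attribute is reported as None, hence the "or ''".
        classes = (attr_map.get("class") or "").split()
        href = attr_map.get("href")
        if self.class_name in classes and href:
            self.links.append(href)


def extract_course_links(html, base_url=BASE_URL, class_name="card__title"):
    """Return absolute URLs for every matching anchor in the given HTML string."""
    parser = CardLinkParser(class_name)
    parser.feed(html)
    parser.close()
    return [urljoin(base_url, href) for href in parser.links]


# Hypothetical markup, assumed for illustration only.
SAMPLE_HTML = """
<div>
  <a class="card__title" href="/the-complete-sql-bootcamp/">The Complete SQL Bootcamp</a>
  <a class="card__title featured" href="/sql-advanced/">SQL Advanced</a>
  <a class="nav-link" href="/cart/">Cart</a>
  <a class="card__title">No href here</a>
</div>
"""

if __name__ == "__main__":
    for link in extract_course_links(SAMPLE_HTML):
        print(link)
```

In practice you would pass `driver.page_source` in place of `SAMPLE_HTML`. `urljoin` is used instead of plain string concatenation so that an href that is already absolute is left unchanged.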
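Which links are you hoping to get? Can you give an example? I tried your code and it gave me a lot of HTML strings back, which seems fine. – linpingta
Apologies, I wasn't very clear. I'd like the course links, so... https://www.udemy.com/the-complete-sql-bootcamp/ etc. – Krishn
Yes, I see... Unfortunately, the site is dynamically generated: part of its HTML content is built via AJAX, which defeats simple web scraping. You could try CasperJS, which simulates a user visit rather than a crawler. I have an example for a Facebook fan page: https://github.com/linpingta/facebook-related/tree/master/facebook-fan-page-fetcher – linpingta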
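Because the content arrives via AJAX, the answer above pauses with a fixed `time.sleep(5)` before reading the page source. A rough alternative is to poll until the content shows up. The helper below is a minimal, standard-library-only sketch; `wait_until` and its parameters are hypothetical names, not part of Selenium (Selenium ships its own explicit-wait helper, `WebDriverWait`, which is probably the better choice in real code).

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without the condition being met.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f seconds" % timeout)
        time.sleep(interval)


# Possible usage with the Selenium driver from the answer (assumed, not run here):
#     wait_until(lambda: "card__title" in driver.page_source, timeout=15)
```

This returns as soon as the marker appears instead of always waiting the full five seconds, and it fails loudly when the page never finishes loading.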