Python crawler: handling the "Load More" button

I'm learning web crawling with Python, and I'd like to know how to handle the "Load More" button at the following URL:

https://www.photo.net/search/#//Sort-View-Count/All-Categories/All-Time/Page-1

(I'm trying to scrape all the images.)

My current code, which uses BeautifulSoup:

from urllib.request import build_opener, HTTPCookieProcessor
from http.cookiejar import CookieJar
from bs4 import BeautifulSoup

url = 'https://www.photo.net/search/#//Sort-View-Count/All-Categories/All-Time/Page-1'

# Keep cookies across requests
cj = CookieJar()
opener = build_opener(HTTPCookieProcessor(cj))

try:
    p = opener.open(url)
    soup = BeautifulSoup(p, 'html.parser')
except Exception as e:
    print(str(e))


2017-06-05 Harry

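For example, have you tried incrementing the page number at the end of the URL? You could loop through a few pages and scrape the content there. – briansrls

Yes, I tried that, but the content won't load unless you click that button. – Harry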
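A note on why looping over page numbers in this URL does not help with a plain HTTP client: everything after the `#` is a URL fragment, which is handled by the browser and is not sent to the server. Here is a minimal sketch using only the standard library; the helper names `page_url` and `requested_url` are made up for illustration:

```python
from urllib.parse import urldefrag, urlsplit

BASE = 'https://www.photo.net/search/#//Sort-View-Count/All-Categories/All-Time/Page-{}'


def page_url(page):
    """Hypothetical helper: build the search URL for a given page number."""
    return BASE.format(page)


def requested_url(url):
    """Return the URL without its fragment, i.e. the part an HTTP client requests."""
    return urldefrag(url).url


for page in (1, 2, 3):
    url = page_url(page)
    # The page number only appears in the fragment, never in the requested URL
    print(urlsplit(url).fragment, '->', requested_url(url))
```

Every page number maps to the same requested URL, so changing `Page-N` changes nothing for `urllib`; the extra results are loaded by JavaScript running in the browser.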

Well, I have a solution for you.

You should try the Selenium module for Python.

1) Install Selenium via pip and download the Chrome driver.

2) Here is how to use it:

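from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

url = 'https://www.photo.net/search/#//Sort-View-Count/All-Categories/All-Time/Page-1'

# Selenium 4 takes the driver path through a Service object
browser = webdriver.Chrome(service=Service('Path to chrome driver'))
browser.get(url)
while True:
    try:
        # Wait up to 10 seconds for the "Load More" link to become clickable
        button = WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.LINK_TEXT, 'Load More')))
    except TimeoutException:
        break  # no button appeared in time, so assume everything is loaded
    button.click()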
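
After the button has been clicked until no more content loads, the expanded page is available as `browser.page_source`, and the image URLs can be pulled out of it. Here is a minimal sketch using the standard library's `HTMLParser` (BeautifulSoup from the question would work as well). The names `ImageCollector` and `extract_image_urls` and the sample HTML are made up for illustration, and the real page's markup may differ:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Hypothetical helper: collect absolute URLs from <img src="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == 'img':
            src = dict(attrs).get('src')
            if src:  # skip <img> tags without a src attribute
                # Resolve relative paths against the page URL
                self.urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url):
    collector = ImageCollector(base_url)
    collector.feed(html)
    return collector.urls


# Made-up sample markup, standing in for browser.page_source
sample = '<div><img src="/photos/1.jpg"><img src="https://example.com/2.jpg"><img alt="no src"></div>'
print(extract_image_urls(sample, 'https://www.photo.net/search/'))
```

With Selenium, this would be called as `extract_image_urls(browser.page_source, browser.current_url)` once the clicking loop has ended.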


2017-09-27 15:26:57
