Selenium (Python) - Getting the webdriver's page_source after the page has fully loaded

I have to get data from a dynamic page (many of them, actually). I can access the page with Selenium in Python. However, driver.page_source is incomplete. Even calling driver.implicitly_wait(100) changed nothing.

I also tried:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait  # available since 2.4.0
from selenium.webdriver.support import expected_conditions as EC  # available since 2.26.0

# driver is an existing WebDriver instance with the page already opened
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.LINK_TEXT, "Load all")))

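Although I can see a wait/pause that is more than long enough for the page to load, driver.page_source is no different after the wait.

Is there a solution for this?

Thanks.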

Source

2014-04-26 user3171971

What do you need the 'page_source' for? – alecxe

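The solution is to use something else to grab the page source, if you really need it. WebDriver's getPageSource returns some state, in some format, of the last page the driver was on.

From the (Java) docs, but most likely applicable to other languages as well: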

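getPageSource

java.lang.String getPageSource()
Get the source of the last loaded page. If the page has been modified after loading (for example, by Javascript) there is no guarantee that the returned text is that of the modified page. Please consult the documentation of the particular driver being used to determine whether the returned text reflects the current state of the page or the text last sent by the web server. The page source returned is a representation of the underlying DOM: do not expect it to be formatted or escaped in the same way as the response sent from the web server. Think of it as an artist's impression.
Returns:
    The source of the current page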

http://selenium.googlecode.com/git/docs/api/java/org/openqa/selenium/WebDriver.html#getPageSource%28%29

Source

2014-04-27 20:15:42
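
One common way to "use something else" is to ask the browser itself for the current DOM instead of relying on page_source. The sketch below is not the answerer's code; it is a minimal illustration that assumes the driver exposes execute_script(script), as Selenium's Python WebDriver does. The helper name get_rendered_html and the ready_script argument are hypothetical, and the readiness check has to be chosen per page.

```python
import time


def get_rendered_html(driver, ready_script, timeout=10.0, poll=0.5):
    """Poll until ready_script returns a truthy value, then return the live DOM.

    driver only needs an execute_script(script) method (assumption: a
    Selenium WebDriver instance). ready_script is a hypothetical,
    page-specific JavaScript snippet that returns true once the dynamic
    content is present.
    """
    deadline = time.monotonic() + timeout
    while True:
        if driver.execute_script(ready_script):
            # Serialize the DOM as it is right now, including any
            # changes made by JavaScript after the initial load.
            return driver.execute_script(
                "return document.documentElement.outerHTML;"
            )
        if time.monotonic() >= deadline:
            raise TimeoutError("page not ready after %s seconds" % timeout)
        time.sleep(poll)
```

With a real session this might be called as get_rendered_html(driver, "return document.readyState === 'complete';"). Note that readyState only covers the initial load, so a check for an element that the page's scripts insert is usually a better readiness signal.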
