Selenium: webdriver.page_source does not refresh after a click

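I want to copy a web page's list of addresses for a given community service into a new document so that I can geocode all of the locations on a map. I can only download one parcel at a time rather than get a list of all parcels, and each page is limited to 25 parcel numbers, so doing this by hand would be extremely time consuming.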

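I want to develop a script that reads the page source (including the 25 addresses, i.e. everything contained in the table tags), clicks the next-page button, and copies the next page, repeating until the maximum page is reached. Afterwards I can format the text so that it is compatible with geocoding.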

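The code below does all of this, except that it only copies the first page over and over, even though I can clearly see that the program has successfully navigated to the next page: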

# Open chrome 
br = webdriver.Chrome() 

raw_input("Navigate to web page. Press enter when done: ") 

pg_src = br.page_source.encode("utf") 
soup = BeautifulSoup(pg_src) 

max_page = 122 #int(max_page) 

#open a text doc to write the results to 

f = open(r'C:\Geocoding\results.txt', 'w') 

# write results page by page until max page number is reached 

pg_cnt = 1 # start on 1 as we should already have the first page 
while pg_cnt < max_page: 
    tble_elems = soup.findAll('table') 
    soup = BeautifulSoup(str(tble_elems)) 
    f.write(str(soup)) 
    time.sleep(5) 
    pg_cnt +=1 
    # clicks the next button 
    br.find_element_by_xpath("//div[@class='next button']").click() 
    # give some time for the page to load 
    time.sleep(5) 
    # get the new page source (THIS IS THE PART THAT DOESN'T SEEM TO BE WORKING) 
    page_src = br.page_source.encode("utf") 
    soup = BeautifulSoup(pg_src) 

f.close()

Source

2017-06-28 ShaunO

After declaring 'br = webdriver.Chrome()', you never load a page into the browser before you make the soup from the page content with BeautifulSoup.

After the browser opens, I navigate to the page myself. I had left the raw_input part of the code out of the original post. It is there now. – ShaunO

Where is the code?

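I faced the same problem. I think it happens because some of the JavaScript has not finished loading. All you need to do is wait until the element is loaded. The code below worked for me: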

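from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

delay = 10  # seconds to wait before giving up
try:
    # 'driver' is the already-open webdriver instance ('br' in the question)
    myElem = WebDriverWait(driver, delay).until(
        EC.presence_of_element_located((By.CLASS_NAME, 'legal-attribute-row')))
except TimeoutException:
    # until() raises TimeoutException if the element never appears
    print("Loading took too much time!")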
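
It may also be worth noting that the loop in the question stores the refreshed source in page_src but builds the next soup from pg_src, which still holds the first page. That alone could explain why only the first page is written. Below is a minimal sketch of a loop that re-reads page_source into a single variable on every pass and waits before reading. The names collect_pages, click_next and wait_until_loaded are hypothetical helpers made up for this illustration, the XPath is the one from the question, and the class name 'legal-attribute-row' is only the placeholder from the answer above, so it would need to be replaced with something that exists on the target page.

```python
def click_next(driver):
    """Click the 'next' button, using the XPath from the question."""
    # Imported lazily so the loop itself can be tried without Selenium installed.
    from selenium.webdriver.common.by import By
    driver.find_element(By.XPATH, "//div[@class='next button']").click()


def wait_until_loaded(driver, delay=10):
    """Wait until a result row is present (placeholder class name, an assumption)."""
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait
    WebDriverWait(driver, delay).until(
        EC.presence_of_element_located((By.CLASS_NAME, "legal-attribute-row")))


def collect_pages(driver, max_page, click_next=click_next,
                  wait_until_loaded=wait_until_loaded):
    """Return the source of pages 1..max_page, read fresh after every click."""
    pages = []
    for page_number in range(1, max_page + 1):
        wait_until_loaded(driver)
        # Read page_source again on each pass and use this same variable,
        # instead of parsing a value captured before the loop started.
        page_src = driver.page_source
        pages.append(page_src)
        if page_number < max_page:
            click_next(driver)
    return pages
```

With the browser from the question, this would be called as collect_pages(br, 122), and each returned string could then be passed to BeautifulSoup and written to the results file. A presence check can succeed against the old page if the click has not taken effect yet, so waiting for an element of the previous page to go stale may be a more reliable signal; that is a suggestion, not something tested against the site in question.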

Source

2018-01-28 03:50:47
