2016-08-23 25 views

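Using Selenium to click a 'Load More' button until it no longer exists (YouTube)

I want to write a Python script that goes to a YouTube channel, clicks the Videos tab, and then scrapes the contents of the page. Everything works until I reach the Load More button. I have managed to get the script to click the Load More button once, but it never presses it again :'( How can I modify the code so that it clicks the button again and again until it no longer exists? That way I can go through a user's entire channel and get the information for each of their videos. Thanks.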

from selenium import webdriver  
from selenium.common.exceptions import NoSuchElementException 
from selenium.webdriver.common.by import By 
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC 

^^ These are the modules I have imported. I don't know how to work the NoSuchElementException exception into my code. Anyway, here is the code:

chrome_path = r"/Users/jack/Desktop/Other/Downloads/Software_and_Programs/chromedriver" 
browser = webdriver.Chrome(chrome_path) 
YOUTUBER_HOME_PAGE_URL = "https://www.youtube.com/user/Google/videos" 
PATIENCE_TIME = 60 
LOAD_MORE_BUTTON_XPATH = '//*[@id="browse-itemsprimary"]/li[2]/button/span/span[2]' 

def waitForLoad(inputXPath): 
    Wait = WebDriverWait(browser, PATIENCE_TIME) 
    Wait.until(EC.presence_of_element_located((By.XPATH, inputXPath))) 

loadMoreButtonExists = True 
while loadMoreButtonExists: 
    try:
        waitForLoad(LOAD_MORE_BUTTON_XPATH)
        WebDriverWait(browser, PATIENCE_TIME)
        loadMoreButton = browser.find_element_by_partial_link_text('Load More')
        #loadMoreButton = browser.find_element_by_xpath(LOAD_MORE_BUTTON_XPATH)
        loadMoreButton.click()
    except:
        print('we have completely loaded every video from this Youtuber. Now we will scrape the video content\n')
        loadMoreButtonExists = False

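I have also tried the XPath approach (commented out in the code above), but that did not seem to work either. It would be awesome if I could get some help with this; I have not been able to find any good answers. I believe this can be done with Selenium, but if not, what should I use instead?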

Answers


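You can use the following code to work through the page. It keeps clicking the "Load more" button until the button can no longer be found: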

from selenium import webdriver
from selenium.webdriver.common.by import By
import time

YOUTUBER_HOME_PAGE_URL = "https://www.youtube.com/user/Google/videos"

# Older Selenium releases accept the chromedriver path as a positional argument;
# newer releases expect it to be passed through a Service object instead.
driver = webdriver.Chrome('./chromedriver.exe')
driver.get(YOUTUBER_HOME_PAGE_URL)

while True:
    try:
        # Look the button up again on every pass so a stale reference is never reused.
        loadMoreButton = driver.find_element(By.XPATH, "//button[contains(@aria-label,'Load more')]")
        time.sleep(2)
        loadMoreButton.click()
        time.sleep(5)  # give the next batch of videos time to load
    except Exception as e:
        # Usually NoSuchElementException once the button is gone, which ends the loop.
        print(e)
        break
print("Complete")
time.sleep(10)
driver.quit()
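
If you would rather not rely on fixed sleeps, the same loop can probably be written around an explicit wait. This is only a rough sketch, not tested against YouTube itself: the names `click_until_gone` and `make_selenium_finder`, the `max_clicks` safety cap, and the 10-second timeout are all my own assumptions. Note that an explicit wait raises TimeoutException, not NoSuchElementException, when the button never shows up, so that is the exception caught here.

```python
import time


def click_until_gone(find_button, pause=0.0, max_clicks=1000):
    """Click the element returned by find_button() until it returns None.

    find_button is a callable that returns a clickable element, or None once
    the button can no longer be found. Returns the number of clicks made.
    max_clicks is a hypothetical safety cap so the loop cannot run forever.
    """
    clicks = 0
    while clicks < max_clicks:
        button = find_button()  # look the button up again on every pass
        if button is None:
            break
        button.click()
        clicks += 1
        time.sleep(pause)  # optional pause while new content loads
    return clicks


def make_selenium_finder(driver, xpath, timeout=10):
    """Build a find_button callable backed by an explicit Selenium wait."""
    # Imported here so click_until_gone can be tried out without Selenium installed.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    def find_button():
        try:
            # Waits up to `timeout` seconds for the button to become clickable.
            return WebDriverWait(driver, timeout).until(
                EC.element_to_be_clickable((By.XPATH, xpath))
            )
        except TimeoutException:
            # The button did not appear in time; treat that as "no more pages".
            return None

    return find_button
```

With the `driver` from the code above, you would call `click_until_gone(make_selenium_finder(driver, "//button[contains(@aria-label,'Load more')]"))` after `driver.get(...)`. Keeping the loop separate from the lookup also lets you check the loop with a fake button before pointing it at a real browser.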

Thanks, that solved my problem – knownUnknown