Why does this YouTube crawler show no output in Python?

import requests
from bs4 import BeautifulSoup

youtube = "https://www.youtube.com/results?search_query="

def get_address(keyword):
    # Build the search URL and download the page's initial HTML
    query = youtube + keyword
    source_code = requests.get(query)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text, "html.parser")

    # Print the href of the first link whose id is "video-title"
    for link in soup.find_all('a', {'id': 'video-title'}):
        href = link.get('href')
        print(href)
        break

get_address("scishow")

The program runs without errors, but instead of printing the video's address it prints nothing at all. Why does this YouTube crawler produce no output?

Source

2017-09-03 user8058054

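Your code is fine. The only reason it prints nothing is that the 'a' tags you are looking for do not exist in the HTML you downloaded. Those 'a' tags are added to the page later by JavaScript, which of course has not run when you retrieve the initial HTML. – pacha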
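One way to see the point of this comment is to count the matching tags in a piece of HTML directly. The sketch below uses only the standard library's html.parser; the names VideoTitleCounter and count_video_links, and both HTML strings, are made up for illustration and are not real YouTube markup.

```python
from html.parser import HTMLParser


class VideoTitleCounter(HTMLParser):
    """Counts <a id="video-title"> start tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened
        if tag == "a" and dict(attrs).get("id") == "video-title":
            self.count += 1


def count_video_links(html):
    """Return how many <a id="video-title"> tags the given HTML contains."""
    parser = VideoTitleCounter()
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical stand-in for the initial HTML: the results are not there yet,
# only a script that would build them in a browser.
static_html = (
    "<html><body><div id='content'></div>"
    "<script>/* builds the results later */</script></body></html>"
)

# Hypothetical stand-in for the page after JavaScript has run.
rendered_html = (
    "<html><body><a id='video-title' href='/watch?v=EXAMPLE'>Title</a>"
    "</body></html>"
)

print(count_video_links(static_html))    # 0
print(count_video_links(rendered_html))  # 1
```

To check the real page, pass requests.get(query).text to count_video_links. If the comment above is right, the count will be 0, which is why the loop in the question never prints anything.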

Most likely it is because the page relies on 'JS'. In that case requests will not help; use 'selenium' instead.

YouTube relies heavily on JavaScript. I suggest you use Selenium. Here is your updated code:

from selenium import webdriver
from bs4 import BeautifulSoup

youtube = "https://www.youtube.com/results?search_query="

def get_address(keyword):
    query = youtube + keyword
    # Load the page in a real browser so its JavaScript runs
    browser = webdriver.Chrome()
    browser.get(query)
    plain_text = browser.page_source
    browser.quit()
    soup = BeautifulSoup(plain_text, "html.parser")

    # Print the href of every link whose id is "video-title"
    for link in soup.find_all('a', {'id': 'video-title'}):
        href = link.get('href')
        print(href)

get_address("scishow")
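Reading page_source right after get() can still miss the links if the results have not been rendered yet. A possible variant, sketched below, waits explicitly for them with Selenium's WebDriverWait and also URL-encodes the keyword. The names build_search_url and get_addresses and the 10 second default timeout are my own choices, and the a#video-title selector is taken from the code above on the assumption that YouTube still uses it.

```python
from urllib.parse import quote_plus

YOUTUBE_SEARCH = "https://www.youtube.com/results?search_query="


def build_search_url(keyword):
    # quote_plus escapes spaces and special characters for a query string
    return YOUTUBE_SEARCH + quote_plus(keyword)


def get_addresses(keyword, timeout=10):
    """Return the href of every video-title link on the results page."""
    # Selenium is imported here so build_search_url works without it installed
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    browser = webdriver.Chrome()
    try:
        browser.get(build_search_url(keyword))
        # Block until at least one matching link exists, or raise
        # TimeoutException once `timeout` seconds have passed
        links = WebDriverWait(browser, timeout).until(
            EC.presence_of_all_elements_located(
                (By.CSS_SELECTOR, "a#video-title")
            )
        )
        return [link.get_attribute("href") for link in links]
    finally:
        # Close the browser even if the wait times out
        browser.quit()
```

Calling get_addresses("scishow") and printing each item of the returned list gives the same kind of output as the code above. The try/finally is there so a timeout does not leave a Chrome window open.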

Source

2017-09-03 10:47:18 chad
