2015-10-15 40 views

BeautifulSoup won't get the page source using Selenium

I am trying to scrape a web page, but I cannot get the site's HTML text using Selenium.

Here is my code so far:

from selenium import webdriver 
from selenium.webdriver.common.keys import Keys 
from selenium.webdriver.common.by import By 
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC 
from bs4 import BeautifulSoup 
import urlparse 

search_term = raw_input("What is your search term?: ") 
url = "https://www.google.co.uk/search?client=ubuntu&channel=fs&q=" 
googurl = url+search_term 
driver = webdriver.Firefox() 

htmltext = driver.get(googurl) 
soup = BeautifulSoup(htmltext.page_source) 

Running this gives me the following traceback:

What is your search term?: hi 
Traceback (most recent call last): 
    File "google page click.py", line 15, in <module> 
    soup = BeautifulSoup(htmltext.page_source) 
AttributeError: 'NoneType' object has no attribute 'page_source' 

Answer


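Always go through the driver object. driver.get() returns None, so there is nothing to assign; the page source is an attribute of the driver itself: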

driver.get(googurl) 
soup = BeautifulSoup(driver.page_source) 
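
For reference, the same fix can be sketched as two small helpers. This is only an illustration under some assumptions: it targets Python 3 (the question's code is Python 2), the names BASE_URL, build_search_url and fetch_page_source are hypothetical and not part of Selenium or BeautifulSoup, and percent-encoding the search term with urllib.parse.quote_plus is an addition that the original code does not do.

```python
from urllib.parse import quote_plus

# Base URL taken from the question; the search term is appended to it.
BASE_URL = "https://www.google.co.uk/search?client=ubuntu&channel=fs&q="


def build_search_url(search_term, base_url=BASE_URL):
    """Return the search URL with the term percent-encoded (hypothetical helper)."""
    return base_url + quote_plus(search_term)


def fetch_page_source(driver, url):
    """Load url in the given WebDriver and return the page HTML as a string.

    driver.get() returns None, so the HTML is read from driver.page_source
    after the page has loaded (hypothetical helper).
    """
    driver.get(url)
    return driver.page_source


# Intended usage with a real browser (not executed here):
#   from selenium import webdriver
#   from bs4 import BeautifulSoup
#   driver = webdriver.Firefox()
#   html = fetch_page_source(driver, build_search_url("hi"))
#   soup = BeautifulSoup(html, "html.parser")
#   driver.quit()
```

Passing "html.parser" explicitly keeps BeautifulSoup from guessing a parser, and calling driver.quit() closes the browser once the HTML has been captured.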

Thanks for this, it's working now. – booberz
