How to scrape data from an HTTPS website using Python and Selenium WebDriver

I have been trying to scrape www.zomato.com for over a week. I searched the web for my problem but could not find a suitable solution, so I am posting my question here.
Here is my web scraper code.

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import Select
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException
from selenium.common.exceptions import NoAlertPresentException
import sys
import lxml
import unittest, time, re

class Sel(unittest.TestCase):
    def setUp(self):
        # Raw string so the backslash in the Windows path is kept literally.
        self.driver = webdriver.PhantomJS(executable_path=r'\phantomjs.exe')  # PhantomJS
        self.driver.implicitly_wait(30)
        self.base_url = "https://www.zomato.com"
        self.verificationErrors = []
        self.accept_next_alert = True

    def test_sel(self):
        driver = self.driver
        delay = 3
        # Join with "/" so the URL is https://www.zomato.com/hyderabad.
        driver.get(self.base_url + "/hyderabad")
        driver.find_element_by_link_text("All").click()
        # Scroll to the bottom repeatedly so lazily loaded results are fetched.
        for i in range(1, 100):
            driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            time.sleep(4)
        html_source = driver.page_source
        data = html_source.encode('utf-8')

    def tearDown(self):
        # Shut down the browser process after each test.
        self.driver.quit()


if __name__ == "__main__":
    unittest.main()

When I run this under Python 3.4, i.e. py -3.4 selenium.py from the directory, I get this error:
selenium-python-phantomJS-SSL
Can anyone help me solve this problem?
Best regards.

2017-01-25 rakesh

Paste the text of the error message; do not link to a screenshot. –

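First, the error screenshot you posted does not come from the code you posted. Your code sample shows you calling webdriver.PhantomJS, but the screenshot clearly shows the error occurring in a call to webdriver.Firefox.

Also, the error message in the screenshot tells you exactly what the problem is and how to fix it: "geckodriver executable needs to be in PATH".

To use Firefox with Selenium, you need to install geckodriver and make it available on your PATH. geckodriver (like chromedriver) is an external component that is not included with Firefox or Selenium ... it must be installed separately.

You can download geckodriver here: https://github.com/mozilla/geckodriver/releases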
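
As a quick check before launching Firefox, something like the following sketch can confirm whether geckodriver is visible on PATH. The helper names find_driver, add_driver_dir_to_path and start_firefox are made up for this illustration, and the directory you pass to add_driver_dir_to_path would be wherever you unpacked the geckodriver download.

```python
import os
import shutil


def find_driver(name="geckodriver"):
    """Return the full path of the driver executable if it is on PATH, else None."""
    return shutil.which(name)


def add_driver_dir_to_path(driver_dir):
    """Prepend driver_dir to PATH for the current process only (hypothetical helper)."""
    os.environ["PATH"] = driver_dir + os.pathsep + os.environ.get("PATH", "")


def start_firefox():
    """Start Firefox through Selenium, failing early if geckodriver is missing."""
    if find_driver("geckodriver") is None:
        raise RuntimeError("geckodriver executable needs to be in PATH")
    # Imported here so the PATH check above works even without Selenium installed.
    from selenium import webdriver
    return webdriver.Firefox()
```

If find_driver() returns None, either move geckodriver into a directory that is already on PATH or call add_driver_dir_to_path() with its folder before start_firefox().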

2017-02-03 02:37:11

Thank you, I will look into it. – rakesh

You need to add the appropriate Accept-Encoding header to your request.

'Accept-Encoding': 'gzip, deflate, sdch, br'
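
With PhantomJS, one way to send such a header is through GhostDriver's phantomjs.page.customHeaders.<name> capabilities. The sketch below is only a guess at how that could look, not a confirmed fix for this error: header_capabilities and start_phantomjs are hypothetical helper names, and webdriver.PhantomJS only exists in older Selenium releases (it was removed in Selenium 4).

```python
# GhostDriver reads per-page request headers from capabilities with this prefix.
HEADER_PREFIX = "phantomjs.page.customHeaders."


def header_capabilities(headers):
    """Turn a plain header dict into PhantomJS desired capabilities."""
    return {HEADER_PREFIX + name: value for name, value in headers.items()}


def start_phantomjs(headers, executable_path="phantomjs"):
    """Start PhantomJS with extra request headers (older Selenium only)."""
    # Imported lazily: webdriver.PhantomJS is absent from Selenium 4 and later.
    from selenium import webdriver
    from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

    caps = dict(DesiredCapabilities.PHANTOMJS)
    caps.update(header_capabilities(headers))
    return webdriver.PhantomJS(executable_path=executable_path,
                               desired_capabilities=caps)


# The header from the answer above, expressed as capabilities.
caps = header_capabilities({"Accept-Encoding": "gzip, deflate, sdch, br"})
```

Calling start_phantomjs({"Accept-Encoding": "gzip, deflate, sdch, br"}) would then return a driver whose page requests carry that header, assuming PhantomJS is installed and reachable.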

2017-01-31 03:33:19 Merch

Sorry, mate, but this doesn't work; I get the same error as before. – rakesh
