Scraping a website with Scrapy and Selenium

I want to scrape the HTML content of http://ntry.com/#/scores/named_ladder/main.php with Scrapy.

However, because the site relies on JavaScript and a # fragment in the URL, I think I also have to use Selenium (Python).
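
A quick way to see why a plain HTTP request is not enough: everything after the # is a URL fragment, which the client keeps to itself and never sends to the server. The sketch below uses only the standard library's `urldefrag` to split the URL from the question; the variable names are illustrative.

```python
from urllib.parse import urldefrag

url = "http://ntry.com/#/scores/named_ladder/main.php"

# urldefrag separates the part that is actually requested from the fragment.
request_url, fragment = urldefrag(url)

print(request_url)  # http://ntry.com/
print(fragment)     # /scores/named_ladder/main.php
```

So a crawler that only downloads the page receives the HTML of http://ntry.com/, and the route after the # is resolved by the site's JavaScript in the browser.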

I would like to write the code myself, but I am new to programming, so I think I need some help.

I want to open ntry.com first and then move to http://ntry.com/#/scores/named_ladder/main.php by clicking the following anchor:

<body>
  <div id="wrap">
    <div id="container">
      <div id="content">
        <a href="/scores/named_ladder/main.php">사다리</a>
      </div>
    </div>
  </div>
</body>

Then I want to scrape the HTML of the changed page with Scrapy.

How can I build a Scrapy spider that also uses Selenium?

Source

2016-11-26 이해주

Possible duplicate of "Selenium with Scrapy for dynamic page" (http://stackoverflow.com/questions/17975471/selenium-with-scrapy-for-dynamic-page) – damienc

I installed Selenium and then loaded the PhantomJS driver, and it worked perfectly.

Here is what you can try:

from scrapy import Spider
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities


class FormSpider(Spider):
    name = "form"

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Copy the default PhantomJS capabilities and override the user agent.
        dcap = dict(DesiredCapabilities.PHANTOMJS)
        dcap["phantomjs.page.settings.userAgent"] = (
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 "
            "(KHTML, like Gecko) Chrome/38.0.2125.122 Safari/537.36"
        )
        # Note: webdriver.PhantomJS was removed in Selenium 4, so this needs Selenium 3.x or earlier.
        self.driver = webdriver.PhantomJS(
            desired_capabilities=dcap,
            service_args=["--ignore-ssl-errors=true", "--ssl-protocol=any", "--web-security=false"],
        )
        self.driver.set_window_size(1366, 768)

    def parse_page(self, response):
        # Load the same URL in the browser so that its JavaScript runs.
        self.driver.get(response.url)
        cookies_list = self.driver.get_cookies()
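
The snippet stops right after collecting the cookies and does not say what they are for. One plausible use, which is an assumption and not something stated in this answer, is to pass the browser's session on to later Scrapy requests. A minimal sketch of that conversion follows; `cookies_for_scrapy` is a hypothetical helper name, and the sample cookies are made up to match the shape of what `driver.get_cookies()` returns.

```python
def cookies_for_scrapy(cookies_list):
    """Reduce Selenium's cookie dicts to a plain name -> value mapping."""
    return {cookie["name"]: cookie["value"] for cookie in cookies_list}


# Made-up example data in the shape returned by driver.get_cookies().
sample = [
    {"name": "PHPSESSID", "value": "abc123", "domain": "ntry.com", "path": "/"},
    {"name": "lang", "value": "ko", "domain": "ntry.com", "path": "/"},
]

print(cookies_for_scrapy(sample))  # {'PHPSESSID': 'abc123', 'lang': 'ko'}
```

Inside the spider, the resulting dict could then be handed to `scrapy.Request(url, cookies=...)`, which accepts a mapping of cookie names to values.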

Source

2016-11-27 20:37:39 Umair

You will have to write your own `start_requests` method... I skipped it. – Umair
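
Whatever `start_requests` ends up doing, the browser has to open ntry.com and click the link before there is anything worth parsing. Below is a rough sketch of that navigation step, assuming Selenium 4 with headless Chrome instead of PhantomJS. The function names, the CSS selector, and the 10-second timeout are illustrative choices, and the sketch assumes the address bar shows exactly the hash URL from the question after the click.

```python
BASE_URL = "http://ntry.com/"
LADDER_HREF = "/scores/named_ladder/main.php"  # href of the anchor in the question


def expected_fragment_url(base_url, href):
    """URL the browser should show after the click: base + '#' + href."""
    return base_url.rstrip("/") + "/#" + href


def fetch_rendered_html(base_url=BASE_URL, href=LADDER_HREF, timeout=10):
    """Open the site, click the link, and return the JavaScript-rendered HTML."""
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(base_url)
        # Wait until the anchor is clickable, then click it.
        link = WebDriverWait(driver, timeout).until(
            EC.element_to_be_clickable((By.CSS_SELECTOR, f'a[href="{href}"]'))
        )
        link.click()
        # Wait for the hash route to change before reading the DOM.
        WebDriverWait(driver, timeout).until(
            EC.url_to_be(expected_fragment_url(base_url, href))
        )
        return driver.page_source
    finally:
        driver.quit()
```

In a spider callback the returned string can be wrapped as `scrapy.Selector(text=html)` so that the usual `.css()` and `.xpath()` queries work on the rendered page. If the ladder content loads after the URL changes, an extra wait on a specific element would be needed.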
