I am using Splash 2.0.2 + Scrapy 1.0.5 + Scrapyjs 0.1.1, and I still cannot get JavaScript rendered by clicking an element. Here is an example URL: https://olx.pt/anuncio/loja-nova-com-250m2-garagem-em-box-fechada-para-arrumos-IDyTzAT.html#c49d3d94cf
The rendered page I get back still does not contain the phone number:
import scrapy


class OlxSpider(scrapy.Spider):
    name = "olx"
    rotate_user_agent = True
    allowed_domains = ["olx.pt"]
    start_urls = [
        "https://olx.pt/imoveis/"
    ]

    def parse(self, response):
        # Lua script run by Splash: load the page, click the second <span>
        # inside #contact_methods, wait briefly, then return the HTML.
        script = """
        function main(splash)
            splash:go(splash.args.url)
            splash:runjs('document.getElementById("contact_methods").getElementsByTagName("span")[1].click();')
            splash:wait(0.5)
            return splash:html()
        end
        """
        # Follow each listing link and render it through Splash.
        for href in response.css('.link.linkWithHash.detailsLink::attr(href)'):
            url = response.urljoin(href.extract())
            yield scrapy.Request(url, callback=self.parse_house_contents, meta={
                'splash': {
                    'args': {'lua_source': script},
                    'endpoint': 'execute',
                }
            })

        # Follow pagination links.
        for next_page in response.css('.pager .br3.brc8::attr(href)'):
            url = response.urljoin(next_page.extract())
            yield scrapy.Request(url, self.parse)

    def parse_house_contents(self, response):
        import ipdb; ipdb.set_trace()
How can I get this to work?
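I really need this to work, because I will be moving on to more complex JS sites, with date pickers, calendars and so on. – psychok7
@psychok7 Are you sure scrapyjs is enough for your complex dynamic website? Maybe switching to `selenium` would make things quicker and simpler. – alecxe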
I tried it a bit. I don't know whether it is possible or not, but I will consider Selenium as well, thanks. – psychok7
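Not a verified fix, but one thing that may be worth trying: in the script from the question, the click runs immediately after `splash:go`, so the contact block may not exist yet when the JavaScript executes. Below is a minimal sketch that adds a wait before the click and checks the result of `splash:go`. The helper name `build_splash_meta` and both wait durations are assumptions made for illustration, and nothing here has been tested against olx.pt.

```python
# Hypothetical variant of the Lua script from the question.
# Assumptions: the 1.0 s waits are guesses, not measured values.
LUA_CLICK_SCRIPT = """
function main(splash)
    -- Stop with an error if the page fails to load.
    assert(splash:go(splash.args.url))
    -- Give the page's own scripts time to build the contact block.
    splash:wait(1.0)
    splash:runjs('document.getElementById("contact_methods").getElementsByTagName("span")[1].click();')
    -- Give the click handler time to fetch and insert the phone number.
    splash:wait(1.0)
    return splash:html()
end
"""


def build_splash_meta(lua_source=LUA_CLICK_SCRIPT):
    """Build the 'splash' meta dict in the same shape the question uses.

    This helper is illustrative only; it is not part of scrapyjs.
    """
    return {
        'splash': {
            'args': {'lua_source': lua_source},
            'endpoint': 'execute',
        }
    }
```

In the spider, this would be passed as `meta=build_splash_meta()` to `scrapy.Request`. If the phone number is still missing, the selector in the `runjs` call or the wait durations are the first things to revisit.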