I'm just getting started with Scrapy and would appreciate a nudge in the right direction.
I want to scrape the data from here:
https://www.sportstats.ca/display-results.xhtml?raceid=29360
This is what I have so far:
import scrapy


class BlogSpider(scrapy.Spider):
    name = 'sportstats'
    start_urls = ['https://www.sportstats.ca/display-results.xhtml?raceid=29360']

    def parse(self, response):
        results = []
        tables = response.xpath('//table')
        # Column headings come from the first table's header row.
        headings = tables[0].xpath('thead/tr/th/span/span/text()').extract()
        rows = tables[0].xpath('tbody/tr[contains(@class, "ui-widget-content ui-datatable")]')
        for row in rows:
            result = []
            for i, td in enumerate(row.xpath('td')):
                heading = headings[i].lower()
                if heading in ('comp.', 'view'):
                    # These columns hold no text worth keeping.
                    content = None
                elif heading == 'name':
                    # The name is wrapped in a link inside the span.
                    content = td.xpath('span/a/text()').extract_first()
                else:
                    # extract_first() returns None when the cell is empty.
                    content = td.xpath('span/text()').extract_first()
                result.append(content)
            results.append(result)
        for result in results:
            print(result)
Now I need to move on to the next page. In the browser I can click the "right arrow" at the bottom, which I believe is the following li:
<li><a id="mainForm:j_idt369" href="#" class="ui-commandlink ui-widget fa fa-angle-right" onclick="PrimeFaces.ab({s:"mainForm:j_idt369",p:"mainForm",u:"mainForm:result_table mainForm:pageNav mainForm:eventAthleteDetailsDialog",onco:function(xhr,status,args){hideDetails('athlete-popup');showDetails('event-popup');scrollToTopOfElement('mainForm\\:result_table');;}});return false;"></a>
How do I get Scrapy to follow this?
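Added my current progress to the main post. – user3449833
This is a JavaScript rendering problem. I would recommend using Firebug (if you use Firefox) to inspect the requests involved, or eventually using a JavaScript rendering service such as [Splash](https://github.com/scrapinghub/splash), or Selenium. – eLRuLL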
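To illustrate the "inspect the requests involved" suggestion: the arrow's `onclick` calls `PrimeFaces.ab`, which normally sends a JSF partial (AJAX) POST rather than loading a new URL. The sketch below only builds the form fields such a POST usually carries, taking the `s`, `p` and `u` values from the `onclick` shown in the question. The helper name `build_primefaces_formdata` and the placeholder ViewState are hypothetical, and I have not tested this against the site, so compare it with what the browser's network tab actually shows (the real request may include additional form fields).

```python
from urllib.parse import urlencode


def build_primefaces_formdata(source_id, form_id, render_ids, view_state):
    """Build the usual POST fields of a JSF/PrimeFaces partial request.

    Hypothetical helper for illustration; the field names are the standard
    JSF ones, but the site may expect more fields than these.
    """
    return {
        "javax.faces.partial.ajax": "true",
        "javax.faces.source": source_id,          # `s` in PrimeFaces.ab
        "javax.faces.partial.execute": form_id,   # `p` in PrimeFaces.ab
        "javax.faces.partial.render": render_ids,  # `u` in PrimeFaces.ab
        source_id: source_id,
        form_id: form_id,
        "javax.faces.ViewState": view_state,
    }


formdata = build_primefaces_formdata(
    source_id="mainForm:j_idt369",
    form_id="mainForm",
    render_ids=(
        "mainForm:result_table mainForm:pageNav "
        "mainForm:eventAthleteDetailsDialog"
    ),
    # Placeholder (assumption): in a spider this would be read from the page,
    # e.g. response.xpath('//input[@name="javax.faces.ViewState"]/@value').get()
    view_state="PLACEHOLDER_VIEW_STATE",
)

# Show the body that would be POSTed back to the results URL.
print(urlencode(formdata))

# Inside the spider's parse(), the same dict could be sent with something like:
#   yield scrapy.FormRequest(response.url, formdata=formdata,
#                            callback=self.parse_page)  # parse_page: hypothetical
```

Two caveats: generated ids like `j_idt369` can change when the site is redeployed, so it is safer to read the id from the page than to hard-code it, and the reply to a partial request is typically an XML document with the updated table HTML embedded inside it, so it needs one more parsing step than the first page. If replaying the request turns out to be too brittle, Splash or Selenium remain the fallback.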