I want to scrape all exhibitors from this dynamic page using Scrapy and Selenium:
https://greenbuildexpo.com/Attendee/Expohall/Exhibitors
But Scrapy does not load the content, so what I am doing now is loading the page with Selenium and searching for the links with Scrapy:
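from scrapy.http import TextResponse
from selenium import webdriver

url = 'https://greenbuildexpo.com/Attendee/Expohall/Exhibitors'
driver_1 = webdriver.Firefox()
driver_1.get(url)
# Hand the browser-rendered HTML to Scrapy so its selectors can be used
content = driver_1.page_source
response = TextResponse(url=url, body=content, encoding='utf-8')
# Count the unique links that point to attendee pages
print(len(set(response.xpath('//*[contains(@href,"Attendee/")]//@href').extract())))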
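The site does not seem to make any new request when the "next" button is pressed, so I expected to get all the links at once, but I only get 43 links with this code. There should be around 500.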
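To make clear what I am after: since one snapshot of page_source apparently only holds the 43 links currently rendered, I assume I need one snapshot per page and then have to merge the results. Below is a minimal sketch of that merging step. It uses the standard library's HTMLParser instead of Scrapy selectors (a swap from my code above, so that it runs without a browser). `HrefCollector`, `collect_links` and the sample HTML strings are hypothetical names and data I made up for illustration.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2


class HrefCollector(HTMLParser):
    """Collects href attribute values that contain a given fragment."""

    def __init__(self, fragment):
        HTMLParser.__init__(self)
        self.fragment = fragment
        self.hrefs = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == 'href' and value and self.fragment in value:
                self.hrefs.add(value)


def collect_links(page_sources, fragment='Attendee/'):
    """Merge matching hrefs from several page_source snapshots into one set."""
    links = set()
    for source in page_sources:
        parser = HrefCollector(fragment)
        parser.feed(source)
        parser.close()
        links |= parser.hrefs
    return links


# Hypothetical snapshots standing in for driver_1.page_source after each click.
page_1 = '<a href="/Attendee/Expohall/A">A</a><a href="/Attendee/Expohall/B">B</a>'
page_2 = '<a href="/Attendee/Expohall/B">B</a><a href="/Other/C">C</a>'
print(len(collect_links([page_1, page_2])))
```

Using a set means a link that shows up on more than one page is only counted once, which matches the set(...) in my original count.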
Now I am trying to crawl the pages by pressing the "next" button:
for i in range(10):
    xpath = '//*[@id="pagingNormalView"]/ul/li[15]'
    driver_1.find_element_by_xpath(xpath).click()
but I get an error:
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 192, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: {"method":"xpath","selector":"//*[@id=\"pagingNormalView\"]/ul/li[15]"}
Stacktrace:
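For reference, this is roughly the loop I would like to end up with: keep clicking "next", take a snapshot each time, and stop when the button can no longer be found instead of hard-coding range(10). It is only a rough sketch under assumptions. `snapshot_pages`, `get_source`, `click_next`, `stop_on` and `FakePager` are hypothetical names of mine, and the fake pager only stands in for the browser so the loop can run without Selenium. It does not explain why li[15] is not found in the first place.

```python
def snapshot_pages(get_source, click_next, stop_on, max_pages=100):
    """Return the page source of each page, clicking 'next' in between.

    get_source: callable returning the current page source
    click_next: callable that clicks 'next'; raises stop_on when it cannot
    stop_on:    exception class that signals there is no further page
    max_pages:  safety limit so the loop always terminates
    """
    sources = [get_source()]
    while len(sources) < max_pages:
        try:
            click_next()
        except stop_on:
            break
        sources.append(get_source())
    return sources


class FakePager(object):
    """Hypothetical stand-in for the browser: three pages, then no 'next'."""

    def __init__(self):
        self.pages = ['<p>page 1</p>', '<p>page 2</p>', '<p>page 3</p>']
        self.index = 0

    def source(self):
        return self.pages[self.index]

    def click_next(self):
        if self.index + 1 >= len(self.pages):
            raise LookupError('no next button')
        self.index += 1


pager = FakePager()
pages = snapshot_pages(pager.source, pager.click_next, stop_on=LookupError)
print(len(pages))
```

With the real driver I would presumably pass `lambda: driver_1.page_source` as get_source, `lambda: driver_1.find_element_by_xpath(xpath).click()` as click_next, and selenium's NoSuchElementException as stop_on. The page may also need some time to re-render after each click, which this sketch does not handle.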