Getting "Next Page" data with Scrapy

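I need to scrape review data from a product website, but its user data is paginated. Each page holds 10 reviews, and there are about 100 pages. How can I scrape all of them?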

My plan is to use yield and Request to follow the "Next Page" link, and then use XPath to extract the data. But I can't get to the next page to extract anything.

Here is the HTML for the "Next Page" link:

<div class="xs-pagebar clearfix"> 
    <div class="Pagecon"> 
      <div class="Pagenum"> 
       <a class="pre-page pre-disable"> 
       <a class="pre-page pre-disable"> 
       <span class="curpage">1</span> 
       <a href="#" onclick="tosubmits(2);return false;">2</a> 
       <a href="#" onclick="tosubmits(3);return false;">3</a> 
       <span class="elli">...</span> 
       <a href="#" class="next-page" onclick="tosubmits('2');return false;">Next Page</a> 
       <a href="#" onclick="tosubmits('94');return false;">Final Page</a> 
      </div> 
    </div> 
</div>

What exactly does href="#" mean?


2014-11-06 samlong

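Unfortunately, you can't do this with Scrapy alone. href="#" is an anchor link that goes nowhere; it is there only to make the element look like a link. What actually happens on a click is that the JavaScript onclick handler runs. For your use case you will need some way to execute JavaScript. You might want to look at Splinter for that.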
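To make the point about the onclick handler concrete, here is a minimal sketch using only Python's standard library. It assumes the pagination markup looks like the snippet in the question; `page_numbers_from_onclick` and `sample` are hypothetical names used for illustration. It shows that the page number lives in the `tosubmits(...)` argument, not in `href`.

```python
import re

# Matches tosubmits(2) as well as tosubmits('94'); the quotes are optional.
TOSUBMITS_RE = re.compile(r"tosubmits\(\s*'?(\d+)'?\s*\)")


def page_numbers_from_onclick(html):
    """Return the sorted, de-duplicated page numbers passed to tosubmits()."""
    return sorted({int(n) for n in TOSUBMITS_RE.findall(html)})


# A trimmed copy of the pagination links from the question.
sample = '''
<a href="#" onclick="tosubmits(2);return false;">2</a>
<a href="#" onclick="tosubmits(3);return false;">3</a>
<a href="#" class="next-page" onclick="tosubmits('2');return false;">Next Page</a>
<a href="#" onclick="tosubmits('94');return false;">Final Page</a>
'''

pages = page_numbers_from_onclick(sample)
last_page = max(pages)  # the "Final Page" link carries the highest number
```

Knowing the last page number only tells you how many pages exist. Fetching them still requires either running the JavaScript or reproducing the request that `tosubmits` sends.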


2014-11-06 14:17:27

Thanks for the explanation. In that case, do you know of any other way to get this done? I've been stuck on this for several days. – samlong 2014-11-06 14:29:10

As I said, you can use Splinter, or look in Chrome's developer tools to see what the JavaScript actually calls: http://stackoverflow.com/questions/8550114/can-scrapy-be-used-to-scrape-dynamic-content-from-websites-that-are-using-ajax – 2014-11-06 14:44:25
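A rough sketch of the developer-tools route, under stated assumptions: suppose the Network tab shows that `tosubmits(n)` sends a POST with the page number as a form field. The endpoint `REVIEWS_URL` and the field name `PAGE_FIELD` below are hypothetical placeholders, and the real request may look quite different. The code only builds the requests with the standard library and does not send them.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values: read the real ones from the browser's Network tab.
REVIEWS_URL = "http://example.com/product/reviews"
PAGE_FIELD = "page"


def build_page_request(page):
    """Build (but do not send) a POST request for one page of reviews."""
    body = urlencode({PAGE_FIELD: page}).encode("ascii")
    return Request(REVIEWS_URL, data=body, method="POST")


# The "Final Page" link in the question points at page 94.
requests_to_send = [build_page_request(p) for p in range(1, 95)]
```

Inside a Scrapy spider the same idea could be expressed by yielding `scrapy.FormRequest(url, formdata={...}, callback=...)` for each page, with the form values passed as strings.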

Thank you very much! I solved the problem by using Splinter. It is a powerful tool for dealing with dynamic web pages, and I like it a lot! – samlong 2014-11-09 12:37:30
