RSelenium webscrape


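I want to scrape a website, but its JavaScript is causing me problems. I am using RSelenium to navigate to the page I want and hand me the HTML, so that I can parse it and extract the data I need. However, this is the one step I can't seem to solve. Here is what I have: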

library('RSelenium') 
checkForServer() 
startServer() 
remDr <- remoteDriver(browserName="firefox", port=4444) 
remDr$open(silent=T) 
library('XML') 
url <- "http://racing.hkjc.com/racing/Info/Meeting/Results/english/Local/20141012/ST/1" 
remDr$navigate(url) 
elem <- remDr$findElement(using="div id", value="results") # PROBLEM HERE, CAN'T FIND A TAG THAT WORKS! 
elemtxt <- elem$getElementAttribute("outerHTML")[[1]] # possible continuation 
elemxml <- htmlTreeParse(elemtxt, useInternalNodes=T)

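(The data I'm after on the page: the results table, the information just above it, the dividend table and the race incident report. I know how to get at those once I have elemxml.)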

Many thanks

Source

2015-11-26 Jimmy

Answer

Something like:

doc <- htmlParse(remDr$getPageSource()[[1]]) 
readHTMLTable(doc)

This should give you access to the full HTML and let you process the tables it contains.
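To try the parsing step without a running Selenium server, here is a minimal sketch that feeds a hand-written HTML string to the same XML functions. The markup, the `results` id and the column names are made up for illustration and are not taken from the real page; in a live session `page_source` would instead be `remDr$getPageSource()[[1]]`.

```r
library(XML)

# Hypothetical stand-in for remDr$getPageSource()[[1]].
# The real page's structure and ids may differ.
page_source <- paste0(
  "<html><body>",
  "<div id='results'><table>",
  "<tr><th>Plc</th><th>Horse</th></tr>",
  "<tr><td>1</td><td>Alpha</td></tr>",
  "<tr><td>2</td><td>Bravo</td></tr>",
  "</table></div>",
  "</body></html>"
)

# Parse the string itself (asText = TRUE) rather than treating it as a file or URL.
doc <- htmlParse(page_source, asText = TRUE)

# readHTMLTable returns a list with one element per <table> in the document.
tables <- readHTMLTable(doc, stringsAsFactors = FALSE)
results <- tables[[1]]

# XPath can narrow the parsed document to one element, here the assumed div.
results_div <- getNodeSet(doc, "//div[@id='results']")[[1]]
div_html <- saveXML(results_div)
```

Because the whole page source is parsed on the R side, no `findElement` call is needed; any table or node can be picked out afterwards by position or with an XPath expression.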

Source

2016-11-01 02:09:28
