library(XML)
my_URL <- "http://www.velocitysharesetns.com/viix"
tables <- readHTMLTable(my_URL)
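The output above contains only the table at the top of the page. It looks like the pie chart is ignored because it is rendered by JavaScript. Is there a simple way to extract the two percentage figures shown in the chart?
I looked into RSelenium,
but I get some errors and have not been able to find any solution.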
> RSelenium::startServer()
Error in if (file.exists(file) == FALSE) if (!missing(asText) && asText == :
argument is of length zero
In addition: Warning messages:
1: startServer is deprecated.
Users in future can find the function in file.path(find.package("RSelenium"), "example/serverUtils").
The sourcing/starting of a Selenium Server is a users responsiblity.
Options include manually starting a server see vignette("RSelenium-basics", package = "RSelenium")
and running a docker container see vignette("RSelenium-docker", package = "RSelenium")
2: running command '"java" -jar "\\med-fs01/Home/Alex.Badoi/R/win-library/3.3/RSelenium/bin/selenium-server-standalone.jar" -log "\\med-fs01/Home/Alex.Badoi/R/win-library/3.3/RSelenium/bin/sellog.txt"' had status 127
3: running command '"wmic" path win32_process get Caption,Processid,Commandline /format:htable' had status 44210
>
Based on Phillip's answer I came up with a working solution:
library(XML)
# extract and parse the HTML
doc.html <- htmlTreeParse('http://www.velocitysharesetns.com/viix',
                          useInternalNodes = TRUE)
# convert the parsed document to a single text string
htmltxt <- paste(capture.output(doc.html, file=NULL), collapse="\n")
# get the position of the first occurrence of the label
pos <- regexpr('CBOE SHORT-TERM VIX FUTURE', htmltxt)
# extract the 99 characters starting at "pos" (covers both chart entries)
keep <- substr(htmltxt, pos, pos+98)
Output:
> keep
[1] "CBOE SHORT-TERM VIX FUTURE DEC 2016', 81.64],\n\n ['CBOE SHORT-TERM VIX FUTURE JAN 2017', 18.36],\n"
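The string in `keep` still has the two percentages embedded in JavaScript array syntax. A minimal sketch of pulling them out as numbers follows, assuming the chart data keeps the `'<label>', <number>]` layout shown in the output above. The names `pattern`, `hits` and `weights` are hypothetical, and the sample string is copied from that output so the snippet runs without fetching the page.

```r
# Sample text copied from the output above; in practice use the `keep`
# variable produced by the substr() call.
keep <- "CBOE SHORT-TERM VIX FUTURE DEC 2016', 81.64],\n\n ['CBOE SHORT-TERM VIX FUTURE JAN 2017', 18.36],\n"

# Assumed layout: an upper-case label ending in "MON YYYY", a closing quote,
# a comma, the numeric weight, and a closing bracket.
pattern <- "([A-Z][A-Z -]+ [A-Z]{3} [0-9]{4})',\\s*([0-9.]+)\\]"

# Collect every "<label>', <number>]" match in the string.
hits <- regmatches(keep, gregexpr(pattern, keep, perl = TRUE))[[1]]

# Split each match into its label and its numeric weight.
weights <- data.frame(
  contract = sub(pattern, "\\1", hits, perl = TRUE),
  percent  = as.numeric(sub(pattern, "\\2", hits, perl = TRUE)),
  stringsAsFactors = FALSE
)

print(weights)
```

Matching on the pattern rather than on a fixed offset such as `pos+98` should keep working if the contract names or number widths change, as long as the page keeps the same array layout.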
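Did you run 'checkForServer(update = TRUE)' before running 'startServer()'? Also, take a look at your browser's inspector (e.g. F12 in Firefox) to see whether the data you want to get can be identified there. – PhillipD
Hi Phillip, it seems the checkForServer command is deprecated as well. As for the second question: using Chrome, I right-clicked and chose Inspect Element. I don't know any Java, but the 2 numbers do not show up in the code. –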