In R XML XPath, @href returns the text "href"

I am trying to use the XPath code described in these two posts to get the contents of href. Unfortunately, the code returns the literal text "href" and several spaces in addition to the URL. How can I avoid this?

library(XML) 

html <- readLines("http://www.msu.edu") 
html.parse <- htmlParse(html) 
Node <- getNodeSet(html.parse, "//div[@id='MSU-top-utilities']//a/@href") 
Node[[1]] 

# > Node[[1]] 
#     href 
# "students/index.html" 
# attr(,"class") 
# [1] "XMLAttributeValue"

Source

2015-10-03 Kevin M

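It's just a named character vector: the "href" in the printed output is the element's name, not part of the value. You can drop it like this: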

as.character(Node[[1]])

which gives you:

## [1] "students/index.html"
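To try this without hitting the network, here is a minimal sketch that parses a small inline HTML string instead of the live page. The markup below is a made-up stand-in (an assumption, not the real msu.edu source); only the `id` and the `href` values are borrowed from the output shown in this thread.

```r
library(XML)

# Hypothetical stand-in for the page; the real markup will differ.
html <- "<html><body><div id='MSU-top-utilities'>
  <a href='students/index.html'>Students</a>
  <a href='alumni/index.html'>Alumni</a>
</div></body></html>"

doc   <- htmlParse(html, asText = TRUE)
nodes <- getNodeSet(doc, "//div[@id='MSU-top-utilities']//a/@href")

# Each element is a length-1 character vector named "href".
names(nodes[[1]])

# as.character() drops the name and the class attribute,
# leaving one plain string per matched attribute.
hrefs <- vapply(nodes, as.character, character(1))
hrefs
```

Using `vapply()` with `character(1)` also guards against a match that unexpectedly yields something other than a single string.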

Alternatively, here is a nicer idiom using the xml2 package:

library(xml2) 

doc <- read_html("http://www.msu.edu") 
nodes <- xml_find_all(doc, "//div[@id='MSU-top-utilities']//a") 
xml_attr(nodes, "href") 

## [1] "students/index.html"  "faculty-staff/index.html" "alumni/index.html"  
## [4] "businesses/index.html" "visitors/index.html"
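The xml2 approach can also be checked offline. The sketch below again uses a made-up inline page (an assumption, not the real site) and shows two ways to get at the values: `xml_attr()` on the `<a>` nodes, which gives `NA` where the attribute is missing, and `xml_text()` on the attribute nodes themselves, which only covers links that actually have an `href`.

```r
library(xml2)

# Hypothetical stand-in for the page; the real markup will differ.
html <- "<html><body><div id='MSU-top-utilities'>
  <a href='students/index.html'>Students</a>
  <a href='alumni/index.html'>Alumni</a>
  <a>No link here</a>
</div></body></html>"

doc   <- read_html(html)
links <- xml_find_all(doc, "//div[@id='MSU-top-utilities']//a")

# One entry per <a>; NA where href is absent. The result carries no names.
hrefs <- xml_attr(links, "href")
hrefs

# Selecting the attribute nodes directly skips <a> elements without href.
hrefs_only <- xml_text(
  xml_find_all(doc, "//div[@id='MSU-top-utilities']//a/@href")
)
hrefs_only
```

Which variant fits depends on whether the positions need to line up with the `<a>` nodes (use `xml_attr()`) or only the existing URLs matter (select `@href` directly).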

Source

2015-10-03 02:38:01 hrbrmstr
