2016-08-10

R XML XPath query returns NULL or list()

I want to extract data from the following XML file into data frames: http://www.uniprot.org/uniprot/P43405.xml

Although I think the XPath queries are fine, I only get empty results back.

library(RCurl) 
library(XML) 
url <- "http://www.uniprot.org/uniprot/P43405.xml" 
urldata <- getURL(url) 
xmlfile <- xmlParse(urldata) 

# some xpath queries 
xmlfile["//entry/comment[@type='function']/text"] 
xmlfile["//entry/comment[@type='PTM']/text"] 

xpathSApply(xmlfile,"//uniprot/entry",xmlGetAttr, 'dataset') 
xpathSApply(xmlfile,"//uniprot/entry",xmlValue) 

Can anyone help me solve this problem?

Thanks, Frank

Could you add a sample of the XML data you are reading? – LordWilmore

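Answers

The namespace is missing. The document declares a default namespace, so every element name in the XPath query needs a prefix bound to that namespace URI: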

library(RCurl) 
library(XML) 

url <- "http://www.uniprot.org/uniprot/P43405.xml" 
urldata <- getURL(url) 
xmlfile <- xmlParse(urldata) 

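# without a namespace prefix the query matches nothing
getNodeSet(xmlfile, "//entry//comment") 
# bind a prefix (the name "ns" is arbitrary) to the document's namespace URI
namespaces <- c(ns="http://uniprot.org/uniprot") 
getNodeSet(xmlfile, "//ns:entry//ns:comment", namespaces) 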

getNodeSet(xmlfile, "//ns:entry//ns:comment[@type='PTM']/ns:text", namespaces) 

xpathSApply(xmlfile,"//ns:uniprot/ns:entry",xmlGetAttr, 'dataset', namespaces=namespaces) 
xpathSApply(xmlfile,"//ns:uniprot/ns:entry",xmlValue, namespaces=namespaces) 
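
To see the effect without going to the network, here is a minimal sketch on a small inline document. The XML string is made up for illustration and only mimics the UniProt layout (a default namespace on the root element); the element contents are placeholders, and the prefix name `ns` is arbitrary.

```r
library(XML)

# Hypothetical miniature document; only the namespace URI is taken from the answer above
doc_txt <- '<uniprot xmlns="http://uniprot.org/uniprot">
  <entry dataset="Example">
    <accession>P43405</accession>
    <comment type="function"><text>Example function text</text></comment>
    <comment type="PTM"><text>Example PTM text</text></comment>
  </entry>
</uniprot>'
doc <- xmlParse(doc_txt, asText = TRUE)

# Unprefixed names refer to elements in no namespace, so nothing matches
no_ns <- getNodeSet(doc, "//entry/comment")

# Bind a prefix to the namespace URI and use it on every element step
namespaces <- c(ns = "http://uniprot.org/uniprot")
with_ns <- getNodeSet(doc, "//ns:entry/ns:comment", namespaces)

# The attribute test needs no prefix: unprefixed attributes are not in the namespace
ptm <- xpathSApply(doc, "//ns:comment[@type='PTM']/ns:text", xmlValue,
                   namespaces = namespaces)

print(length(no_ns))
print(length(with_ns))
print(ptm)
```

The prefix only has to be consistent between the `namespaces` vector and the query; it does not have to match anything in the XML file itself.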

References:

?xpathApply

How can I use xpath querying using R's XML library?
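
If the namespace URI is not known in advance, it can usually be read from the root element rather than typed by hand. The following is a sketch under the assumption that the namespace is declared on the root, as in the made-up inline document used here; `defs` and `uri` are names chosen for illustration.

```r
library(XML)

# Hypothetical miniature document with a default namespace on the root
doc <- xmlParse('<uniprot xmlns="http://uniprot.org/uniprot"><entry dataset="Example"><accession>P43405</accession></entry></uniprot>',
                asText = TRUE)

# List the namespace declarations found on the root element
defs <- xmlNamespaceDefinitions(xmlRoot(doc), simplify = TRUE)
uri <- as.character(defs)[1]

# Reuse the discovered URI under a prefix of our own choosing
namespaces <- c(ns = uri)
acc <- xpathSApply(doc, "//ns:entry/ns:accession", xmlValue,
                   namespaces = namespaces)
print(uri)
print(acc)
```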

Thanks for your help! Yes, the namespace was missing. I have added some extra code; maybe it will help others get familiar with XML.

library(RCurl) 
library(XML) 

url <- "http://www.uniprot.org/uniprot/P43405.xml" 
urldata <- getURL(url) 
xmlfile <- xmlParse(urldata) 

# without the namespace this returns an empty list
getNodeSet(xmlfile, "//entry//comment") 

# one needs the namespace here 
namespaces <- c(ns="http://uniprot.org/uniprot") 

# extract all comments, make a data frame 
comments.uniprot <- getNodeSet(xmlfile, "//ns:entry//ns:comment", namespaces) 
comments.dataframe <- as.data.frame(sapply(comments.uniprot, xmlValue)) 
comments.attributes <- as.data.frame(sapply(comments.uniprot, xmlGetAttr,'type')) 
comments.all <- cbind(comments.attributes,comments.dataframe) 

# only extract PTM comments 
PTMs <- getNodeSet(xmlfile, "//ns:entry//ns:comment[@type='PTM']/ns:text", namespaces) 
PTMs2 <- sapply(PTMs, xmlValue) 
PTMs2.dataframe <- as.data.frame(PTMs2) 


xpathSApply(xmlfile,"//ns:uniprot/ns:entry",xmlGetAttr, 'dataset', namespaces=namespaces) 
xpathSApply(xmlfile,"//ns:uniprot/ns:entry/ns:accession",xmlValue, namespaces=namespaces)
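
The `cbind` of two one-column data frames above works, but the columns end up with auto-generated names. A possible variant, sketched on a made-up inline document so that it runs offline, builds the data frame in one call with explicit column names. The column names `type` and `text` and the element contents are assumptions chosen for illustration.

```r
library(XML)

# Hypothetical miniature document that mimics the UniProt layout
doc_txt <- '<uniprot xmlns="http://uniprot.org/uniprot">
  <entry dataset="Example">
    <comment type="function"><text>Example function text</text></comment>
    <comment type="PTM"><text>Example PTM text</text></comment>
  </entry>
</uniprot>'
doc <- xmlParse(doc_txt, asText = TRUE)
namespaces <- c(ns = "http://uniprot.org/uniprot")

# Select all comment nodes once, then pull the attribute and the text from each
comments <- getNodeSet(doc, "//ns:entry/ns:comment", namespaces)
comments_df <- data.frame(
  type = sapply(comments, xmlGetAttr, "type"),
  text = sapply(comments, xmlValue),
  stringsAsFactors = FALSE
)

# Filtering by type is then an ordinary data frame operation
ptm_rows <- comments_df[comments_df$type == "PTM", ]
print(comments_df)
```

Keeping the type and the text in one data frame avoids a second XPath query per comment type.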