2016-07-06 · 57 views

Answer

Here is something you can try.

Collect your links in a CSV file. Since the only thing that changes in the links is the srep ID at the end, lay it out as shown below:

> head(links) 
            links 
1 http://www.nature.com/articles/srep20000 
2 http://www.nature.com/articles/srep20001 
3 http://www.nature.com/articles/srep20002 
4 http://www.nature.com/articles/srep20003 
5 http://www.nature.com/articles/srep20004 
6 http://www.nature.com/articles/srep20005 
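Since the only varying part of each link is the numeric srep ID, the link column could also be built directly in R instead of maintaining a CSV. A minimal sketch (the `20000:20005` range simply mirrors the example IDs shown above):

```r
# Build the link column programmatically from the srep IDs
ids <- 20000:20005
links <- data.frame(
  links = paste0("http://www.nature.com/articles/srep", ids),
  stringsAsFactors = FALSE
)
```

Either way you end up with the same one-column data frame that the loop below expects.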

Then run the following code:

library(rvest)

# Read the link list (header row present, "~" as the field separator)
links <- read.csv("link.csv", header = TRUE, sep = "~")

for (i in 1:nrow(links)) {

  url <- read_html(as.character(links[i, 1]))

  # Received date
  links[i, 2] <- url %>%
    html_node("dd:nth-child(2) time") %>%
    html_text() %>%
    as.character()

  # Accepted date
  links[i, 3] <- url %>%
    html_node("dd:nth-child(4) time") %>%
    html_text() %>%
    as.character()
}

colnames(links)[2] <- "Received"
colnames(links)[3] <- "Accepted"

You will get a result like this:

> head(links) 
            links   Received   Accepted 
1 http://www.nature.com/articles/srep20000 15 October 2015 22 December 2015 
2 http://www.nature.com/articles/srep20001 21 October 2015 22 December 2015 
3 http://www.nature.com/articles/srep20002 20 October 2015 22 December 2015 
4 http://www.nature.com/articles/srep20003 10 November 2015 22 December 2015 
5 http://www.nature.com/articles/srep20004 15 November 2015 22 December 2015 
6 http://www.nature.com/articles/srep20005 09 November 2015 22 December 2015 

Note: the more URLs you have, the longer the code will take to finish. Also, the site does not allow bot-like actions on its pages, so there is no alternative way to fetch all the information for you at once.
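Because of that, a single failed request would stop the whole loop. One way to make the loop more resilient is to wrap each fetch in a retry helper that pauses between attempts. `fetch_politely` below is a hypothetical helper, not part of the original answer:

```r
# Hypothetical helper: retry a fetch a few times, pausing between
# attempts so the scraper behaves less like a bot.
fetch_politely <- function(fetch, retries = 3, pause = 2) {
  for (attempt in seq_len(retries)) {
    result <- tryCatch(fetch(), error = function(e) NULL)
    if (!is.null(result)) return(result)
    Sys.sleep(pause)  # back off before trying again
  }
  NA_character_  # give up gracefully instead of aborting the loop
}
```

Inside the loop you would then write something like `url <- fetch_politely(function() read_html(as.character(links[i, 1])))` and skip the row if it comes back as `NA`.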

Works fine. Thanks – yliueagle