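Scrape the "string" code from a URL and put it into a vector using rvest in R

I'm new to R and rvest. Two days ago I got help with this code, which scrapes all the player names and works great. Now I'm trying to add code to the "fetch_current_players" function so that it also builds a vector of the site's player codes (taken from each player's URL). Any help would be appreciated, as I've spent a day googling, reading, and watching YouTube videos trying to teach myself. Thanks!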
library(rvest)
library(purrr) # flatten/map/safely
library(dplyr) # progress bar
fetch_current_players <- function(letter) {
  URL <- sprintf("http://www.baseball-reference.com/players/%s/", letter)
  pg <- read_html(URL)
  if (is.null(pg)) return(NULL)
  player_data <- html_nodes(pg, "b a") # links inside <b> tags
  player_href <- html_attr(player_data, "href") # e.g. "/players/a/aardsda01.shtml"
  # Strip the directory and the ".shtml" extension, leaving only the code.
  # Unlike substring(player_href, 12, 20), this does not assume a fixed code length.
  player_code <- sub("\\.shtml$", "", basename(player_href))
  names(player_code) <- html_text(player_data) # label each code with the player's name
  player_code # return value: one character vector of codes per letter page
}
pb <- progress_estimated(length(letters))
player_list <- flatten_chr(map(letters, function(x) {
  pb$tick()$print()
  fetch_current_players(x)
}))
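Pulling the code out of each href can be tried without hitting the site. The following is a minimal sketch, assuming hrefs of the form /players/<letter>/<code>.shtml; the two sample paths and the names by_position and by_pattern are illustrative only. It compares cutting at fixed character positions with stripping the directory and extension, which does not depend on the code's length.

```r
# Sample hrefs in the assumed shape /players/<letter>/<code>.shtml (illustrative)
hrefs <- c("/players/a/aardsda01.shtml", "/players/o/ottme01.shtml")

# Fixed positions: only correct when the code is exactly nine characters long
by_position <- substring(hrefs, 12, 20)

# Pattern-based: drop the directory with basename(), then the ".shtml" suffix
by_pattern <- sub("\\.shtml$", "", basename(hrefs))

print(by_position)
print(by_pattern)
```

With a seven-character code, the fixed-position version picks up part of the extension, while the pattern-based version returns just the code.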
Thanks, works perfectly! – Nitreg