2017-06-29

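Web scraping: data frame is not being populated

I'm trying to scrape data from TransferMarkt, specifically the names and URLs of players in the Premier League. I do this by first scraping the URLs of all the teams in the league, then going through each team's page to get the individual players. The problem is that the data is not being saved to the data frame. When I tried to find the number of rows (no.of.rows), it was still zero, so I printed the data frame (Catcher1) to see what was going on, and it is empty! Any help would be greatly appreciated, thanks.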

library(rvest) 

URL <- "http://www.transfermarkt.com/premier-league/startseite/wettbewerb/GB1" 

WS <- read_html(URL) 

URLs <- WS %>% html_nodes(".hide-for-pad .vereinprofil_tooltip") %>% html_attr("href") %>% as.character() 
URLs <- paste0("http://www.transfermarkt.com",URLs) 

Catcher1 <- data.frame(Player=character(),P_URL=character()) 

for (i in URLs) { 
    WS1 <- read_html(i) 
    Player <- WS1 %>% html_nodes("#yw1 .tooltipstered")%>%html_text()%>%as.character() 
    P_URL <- WS1 %>% html_nodes("#yw1 .tooltipstered")%>%html_attr("href")%>%as.character() 
    temp <- data.frame(Player,P_URL) 
    Catcher1 <- rbind(Catcher1,temp) 
    cat("*") 
} 

print(Catcher1) 
no.of.rows <- nrow(Catcher1) 
odd_indexes<-seq(1,no.of.rows,2) 
Catcher1 <- data.frame(Catcher1[odd_indexes,]) 

Catcher1$P_URL <- paste0("http://www.transfermarkt.com",Catcher1$P_URL) 

Answer

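I don't see a #yw1 ID on the page, but the code below uses a CSS selector that is specific enough to get what you want (though I really can't be sure, since I don't do this kind of sports scraping and don't follow the sport).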

library(rvest) 
library(tidyverse) 

URL <- "http://www.transfermarkt.com/premier-league/startseite/wettbewerb/GB1" 

WS <- read_html(URL) 

html_nodes(WS, ".hide-for-pad .vereinprofil_tooltip") %>% 
    html_attr("href") %>% 
    sprintf("http://www.transfermarkt.com%s", .) -> URLs 

pb <- progress_estimated(length(URLs)) 
map_df(URLs, ~{ 

    pb$tick()$print() 

    Sys.sleep(sample(3:6, 1)) # be kind to the remote site since you're using a robot vs a human and you have time 

    tmp <- read_html(.x) 

    # tibble() replaces the deprecated data_frame()
    tibble(
        player = html_nodes(tmp, "td > div:first-of-type > span > a.spielprofil_tooltip") %>% html_text(), 
        url = html_nodes(tmp, "td > div:first-of-type > span > a.spielprofil_tooltip") %>% html_attr("href") 
    ) 

}) -> players_df 

players_df 
## # A tibble: 571 x 2
##    player           url
##    <chr>            <chr>
##  1 Thibaut Courtois /thibaut-courtois/profil/spieler/108390
##  2 Asmir Begovic    /asmir-begovic/profil/spieler/33873
##  3 Eduardo          /eduardo/profil/spieler/34159
##  4 Jamal Blackman   /jamal-blackman/profil/spieler/128898
##  5 David Luiz       /david-luiz/profil/spieler/46741
##  6 Gary Cahill      /gary-cahill/profil/spieler/27511
##  7 Kurt Zouma       /kurt-zouma/profil/spieler/157509
##  8 Nathan Aké       /nathan-ake/profil/spieler/177476
##  9 Tomás Kalas      /tomas-kalas/profil/spieler/148657
## 10 John Terry       /john-terry/profil/spieler/3160
## # ... with 561 more rows
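
To see why the original loop left Catcher1 empty, it can help to check how many nodes a selector matches before building a data frame from it. The sketch below uses a small inline HTML snippet as a hypothetical stand-in for a squad page (the real markup may differ). The `tooltipstered` class looks like one that a tooltip plugin adds in the browser after the page loads, which would explain why it is missing from the HTML that `read_html()` receives; that is an assumption, not something confirmed in this thread.

```r
library(rvest)

# Hypothetical stand-in for a squad page; the real TransferMarkt markup may differ.
page <- read_html('
  <table><tr><td><div><span>
    <a class="spielprofil_tooltip" href="/thibaut-courtois/profil/spieler/108390">Thibaut Courtois</a>
  </span></div></td></tr></table>')

# A selector that matches nothing returns an empty node set, not an error.
no_match <- html_nodes(page, "#yw1 .tooltipstered")
length(no_match)

# Zero-length vectors give a zero-row data frame, so rbind() adds nothing.
temp <- data.frame(Player = html_text(no_match),
                   P_URL  = html_attr(no_match, "href"))
nrow(temp)

# A selector based on markup that is present in the raw HTML does match.
found <- html_nodes(page, "a.spielprofil_tooltip")
length(found)
```

Running `length()` on the result of `html_nodes()` for a single team URL, before entering the loop, is usually the quickest way to tell a selector problem from a data frame problem.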

Absolutely beautiful, thank you!

If it works, ticking the answer's checkmark helps others know that it is a working answer – hrbrmstr
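
The question's last steps keep every other row and prepend the site root to each relative URL. Note that `seq(1, no.of.rows, 2)` raises an error when the data frame has zero rows, so that step fails on an empty result. A minimal sketch of an alternative follows, assuming the every-other-row step was meant to drop rows that were matched twice; the sample rows are hypothetical and only shaped like `players_df`.

```r
library(xml2)

# Hypothetical sample rows shaped like players_df; the repeated row mimics
# a player being matched twice on the same page.
players_df <- data.frame(
  player = c("Thibaut Courtois", "Thibaut Courtois", "Asmir Begovic"),
  url = c("/thibaut-courtois/profil/spieler/108390",
          "/thibaut-courtois/profil/spieler/108390",
          "/asmir-begovic/profil/spieler/33873"),
  stringsAsFactors = FALSE
)

# Drop exact duplicate rows; unlike seq(1, n, 2), this also works when n is 0.
players_df <- unique(players_df)

# Resolve each relative path against the site root.
players_df$url <- url_absolute(players_df$url, "http://www.transfermarkt.com/")

players_df$url
```

`unique()` removes only rows that are identical in every column, so if the duplicates on the real page differ in some way (for example in whitespace), a different rule would be needed.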