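I am downloading weather data from the web. To do this, I wrote a simple for loop that adds a data frame with the data to a list (one list per city). It works fine, but when there is no data (the page has no table of weather conditions for a particular date) it returns an error, for example for this URL: https://www.wunderground.com/history/airport/EPLB/2015/12/25/DailyHistory.html?req_city=Abramowice%20Koscielne&req_statename=Poland. The error in R is: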
Error in Lublin[i] <- url4 %>% read_html() %>% html_nodes(xpath = "//*[@id=\"obsTable\"]") %>% :
replacement has length zero
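How can I add an if statement that, when this error occurs, returns a row of NAs (13 observations) and puts it in the list instead?
Also, is there a faster way to download all the data than a for loop?
My code: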
library(rvest)  # provides read_html(), html_nodes(), html_table() and the %>% pipe
c<-seq(as.Date("2015/1/1"), as.Date("2016/12/31"), "days")
Warszawa <- list()
Wroclaw <- list()
Bydgoszcz <- list()
Lublin <- list()
Gorzow <- list()
Lodz <- list()
Krakow <- list()
Opole <- list()
Rzeszow <- list()
Bialystok <- list()
Gdansk <- list()
Katowice <- list()
Kielce <- list()
Olsztyn <- list()
Poznan <- list()
Szczecin <- list()
date <- list()
for(i in 1:length(c)) {
y<-as.numeric(format(c[i],'%Y'))
m<-as.numeric(format(c[i],'%m'))
d<-as.numeric(format(c[i],'%d'))
date[i] <- c[i]
url1 <- sprintf("https://www.wunderground.com/history/airport/EPWA/%d/%d/%d/DailyHistory.html?req_city=Warszawa&req_state=MZ&req_statename=Poland", y, m, d)
url2 <- sprintf("https://www.wunderground.com/history/airport/EPWR/%d/%d/%d/DailyHistory.html?req_city=Wrocław&req_statename=Poland", y, m, d)
url3 <- sprintf("https://www.wunderground.com/history/airport/EPBY/%d/%d/%d/DailyHistory.html?req_city=Bydgoszcz&req_statename=Poland", y, m, d)
url4 <- sprintf("https://www.wunderground.com/history/airport/EPLB/%d/%d/%d/DailyHistory.html?req_city=Abramowice%%20Koscielne&req_statename=Poland", y, m, d)
url5 <- sprintf("https://www.wunderground.com/history/airport/EPZG/%d/%d/%d/DailyHistory.html?req_city=Gorzow%%20Wielkopolski&req_statename=Poland", y, m, d)
url6 <- sprintf("https://www.wunderground.com/history/airport/EPLL/%d/%d/%d/DailyHistory.html?req_city=Lodz&req_statename=Poland", y, m, d)
url7 <- sprintf("https://www.wunderground.com/history/airport/EPKK/%d/%d/%d/DailyHistory.html?req_city=Krakow&req_statename=Poland", y, m, d)
url8 <- sprintf("https://www.wunderground.com/history/airport/EPWR/%d/%d/%d/DailyHistory.html?req_city=Opole&req_statename=Poland", y, m, d)
url9 <- sprintf("https://www.wunderground.com/history/airport/EPRZ/%d/%d/%d/DailyHistory.html?req_city=Rzeszow&req_statename=Poland", y, m, d)
url10 <- sprintf("https://www.wunderground.com/history/airport/UMMG/%d/%d/%d/DailyHistory.html?req_city=Dojlidy&req_statename=Poland", y, m, d)
url11 <- sprintf("https://www.wunderground.com/history/airport/EPGD/%d/%d/%d/DailyHistory.html?req_city=Gdansk&req_statename=Poland", y, m, d)
url12 <- sprintf("https://www.wunderground.com/history/airport/EPKM/%d/%d/%d/DailyHistory.html?req_city=Katowice&req_statename=Poland", y, m, d)
url13 <- sprintf("https://www.wunderground.com/history/airport/EPKT/%d/%d/%d/DailyHistory.html?req_city=Chorzow%%20Batory&req_statename=Poland", y, m, d)
url14 <- sprintf("https://www.wunderground.com/history/airport/EPSY/%d/%d/%d/DailyHistory.html", y, m, d)
url15 <- sprintf("https://www.wunderground.com/history/airport/EPPO/%d/%d/%d/DailyHistory.html?req_city=Poznan%%20Old%%20Town&req_statename=Poland", y, m, d)
url16 <- sprintf("https://www.wunderground.com/history/airport/EPSC/%d/%d/%d/DailyHistory.html?req_city=Szczecin&req_statename=Poland", y, m, d)
Warszawa[i] <- url1 %>%
read_html() %>%
html_nodes(xpath='//*[@id="obsTable"]') %>%
html_table()
Wroclaw[i] <- url2 %>%
read_html() %>%
html_nodes(xpath='//*[@id="obsTable"]') %>%
html_table()
Bydgoszcz[i] <- url3 %>%
read_html() %>%
html_nodes(xpath='//*[@id="obsTable"]') %>%
html_table()
Lublin[i] <- url4 %>%
read_html() %>%
html_nodes(xpath='//*[@id="obsTable"]') %>%
html_table()
Gorzow[i] <- url5 %>%
read_html() %>%
html_nodes(xpath='//*[@id="obsTable"]') %>%
html_table()
Lodz[i] <- url6 %>%
read_html() %>%
html_nodes(xpath='//*[@id="obsTable"]') %>%
html_table()
Krakow[i] <- url7 %>%
read_html() %>%
html_nodes(xpath='//*[@id="obsTable"]') %>%
html_table()
Opole[i] <- url8 %>%
read_html() %>%
html_nodes(xpath='//*[@id="obsTable"]') %>%
html_table()
Rzeszow[i] <- url9 %>%
read_html() %>%
html_nodes(xpath='//*[@id="obsTable"]') %>%
html_table()
Bialystok[i] <- url10 %>%
read_html() %>%
html_nodes(xpath='//*[@id="obsTable"]') %>%
html_table()
Gdansk[i] <- url11 %>%
read_html() %>%
html_nodes(xpath='//*[@id="obsTable"]') %>%
html_table()
Katowice[i] <- url12 %>%
read_html() %>%
html_nodes(xpath='//*[@id="obsTable"]') %>%
html_table()
Kielce[i] <- url13 %>%
read_html() %>%
html_nodes(xpath='//*[@id="obsTable"]') %>%
html_table()
Olsztyn[i] <- url14 %>%
read_html() %>%
html_nodes(xpath='//*[@id="obsTable"]') %>%
html_table()
Poznan[i] <- url15 %>%
read_html() %>%
html_nodes(xpath='//*[@id="obsTable"]') %>%
html_table()
Szczecin[i] <- url16 %>%
read_html() %>%
html_nodes(xpath='//*[@id="obsTable"]') %>%
html_table()
}
Thanks for your help.
You can use `tryCatch` for this.
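A minimal sketch of what that could look like, under a few assumptions. The helper names `na_placeholder`, `scrape_obs_table` and `safe_obs_table` are made up for illustration, and the placeholder is assumed to be a single row with 13 NA columns (the "13 observations" from the question). Note that the "replacement has length zero" error is raised by the assignment, because `html_table()` returns an empty list when the table is missing, so the sketch checks the length of the result as well as catching download or parsing errors.

```r
# Hypothetical placeholder: one row of NAs with n_cols columns
# (13 is an assumption taken from the question).
na_placeholder <- function(n_cols = 13) {
  as.data.frame(matrix(NA, nrow = 1, ncol = n_cols))
}

# Same steps as the pipeline in the question, written without the pipe.
# Returns a list of data frames; the list is empty when the table is absent.
scrape_obs_table <- function(url) {
  page  <- rvest::read_html(url)
  nodes <- rvest::html_nodes(page, xpath = '//*[@id="obsTable"]')
  rvest::html_table(nodes)
}

# Returns the first table found, or the NA placeholder when the page
# cannot be read or contains no matching table.
safe_obs_table <- function(url, scrape = scrape_obs_table, n_cols = 13) {
  tables <- tryCatch(scrape(url), error = function(e) list())
  if (length(tables) == 0) na_placeholder(n_cols) else tables[[1]]
}
```

Inside the loop this would be used as `Lublin[[i]] <- safe_obs_table(url4)`. The double brackets matter here because the helper returns a data frame rather than a list of length one.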
Tip: avoid using `c` as a variable name, because `c()` is the function used to create vectors in R.
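To illustrate the tip, here is a small sketch that uses the name `dates` (chosen here for illustration) instead of `c`. It also shows that R still finds the function `c()` when a variable of the same name exists, so the original code runs; the issue is readability rather than correctness.

```r
# A descriptive name avoids confusion with the base function c().
dates <- seq(as.Date("2015-01-01"), as.Date("2016-12-31"), by = "day")

# seq_along() is safer than 1:length(x): it gives an empty sequence
# when the input is empty, so the loop body is simply skipped.
for (i in seq_along(dates[1:3])) {
  print(format(dates[i], "%Y/%m/%d"))
}

# A variable called c does not break calls to c(): when evaluating a call,
# R skips bindings that are not functions. It is still confusing to read.
c <- 1:3
v <- c(10, 20)
rm(c)  # remove the variable again
```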
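You also have quite a lot of repeated code. I think you could write a function that does the same thing and just swaps in the variables you need. As for the error, I would follow @docendo discimus's suggestion.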
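One way the repetition could be factored out is sketched below. The names `build_url` and `scrape_all` are hypothetical, only three of the station codes from the question are listed, and the `req_city`/`req_statename` query parameters are dropped on the assumption that they are optional (the Olsztyn URL in the question has none). The `scrape` argument is any function that takes a URL and returns the parsed table, for example an error-tolerant wrapper built with `tryCatch`.

```r
# Airport codes taken from the URLs in the question (subset for brevity).
stations <- c(Warszawa = "EPWA", Lublin = "EPLB", Gdansk = "EPGD")

# Build the history URL for one station and one date.
# Assumption: the query string used in the question can be omitted.
build_url <- function(station, date) {
  sprintf(
    "https://www.wunderground.com/history/airport/%s/%d/%d/%d/DailyHistory.html",
    station,
    as.integer(format(date, "%Y")),
    as.integer(format(date, "%m")),
    as.integer(format(date, "%d"))
  )
}

# For every station, apply `scrape` to the URL of every date.
# Returns a named list (one element per city), each a list with one entry per date.
scrape_all <- function(stations, dates, scrape) {
  lapply(stations, function(code) {
    lapply(seq_along(dates), function(i) scrape(build_url(code, dates[i])))
  })
}

# Example call (commented out because it needs network access):
# dates  <- seq(as.Date("2015-01-01"), as.Date("2015-01-03"), by = "day")
# result <- scrape_all(stations, dates, scrape = my_scraper)  # my_scraper is hypothetical
# result$Lublin[[1]]
```

This mainly removes the duplication. It is unlikely to be noticeably faster than the for loop on its own, since most of the time is probably spent waiting for the downloads rather than in the loop.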