2017-05-31 99 views

Geocoding addresses in R

I have 32K rows of addresses for which I need to find the longitude/latitude values.

The code I am using is here. I am very grateful to the person who created it, but I have one problem:

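I would like to edit it so that if the loop runs into a problem with the address in the current row, it simply puts NA in the lat/long fields and moves on to the next row. Does anyone know how to do this? The code is below: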

# Geocoding a csv column of "addresses" in R 

#load ggmap 
library(ggmap) 

# Select the file from the file chooser 
fileToLoad <- file.choose(new = TRUE) 

# Read in the CSV data and store it in a variable 
origAddress <- read.csv(fileToLoad, stringsAsFactors = FALSE) 

# Initialize the data frame 
geocoded <- data.frame(stringsAsFactors = FALSE) 

# Loop through the addresses to get the latitude and longitude of each address and add it to the 
# origAddress data frame in new columns lat and lon 
for(i in 1:nrow(origAddress)) 
{ 
    # Print("Working...") 
    result <- geocode(origAddress$addresses[i], output = "latlona", source = "google") 
    origAddress$lon[i] <- as.numeric(result[1]) 
    origAddress$lat[i] <- as.numeric(result[2]) 
    origAddress$geoAddress[i] <- as.character(result[3]) 
} 
# Write a CSV file containing origAddress to the working directory 
write.csv(origAddress, "geocoded.csv", row.names=FALSE) 

Answer


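You can use tryCatch() to intercept geocoding warnings and return a data.frame with the same structure (lon, lat, address) that geocode() would return. Note that the handler below catches warnings only.

Your code would then be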

# Geocoding a csv column of "addresses" in R 

# load ggmap 
library(ggmap) 

# Select the file from the file chooser 
fileToLoad <- file.choose(new = TRUE) 

# Read in the CSV data and store it in a variable 
origAddress <- read.csv(fileToLoad, stringsAsFactors = FALSE) 

# Loop through the addresses to get the latitude and longitude of each address and add it to the 
# origAddress data frame in new columns lat and lon 
for(i in 1:nrow(origAddress)) { 
    result <- tryCatch(geocode(origAddress$addresses[i], output = "latlona", source = "google"), 
        warning = function(w) data.frame(lon = NA, lat = NA, address = NA)) 
    origAddress$lon[i] <- as.numeric(result[1]) 
    origAddress$lat[i] <- as.numeric(result[2]) 
    origAddress$geoAddress[i] <- as.character(result[3]) 
} 
# Write a CSV file containing origAddress to the working directory 
write.csv(origAddress, "geocoded.csv", row.names=FALSE) 
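
The loop above substitutes NAs only when geocode() raises a warning. If a lookup can also stop with an error, the same idea extends to an `error` handler. The following is a minimal sketch under stated assumptions: `safe_geocode` and `fake_geocode` are hypothetical names introduced here, and `fake_geocode` is a stand-in for geocode() with made-up coordinates so the example runs without ggmap or network access.

```r
# Wrap any geocoding function so that a warning OR an error yields a row of NAs.
# `geocoder` is passed in as an argument so the sketch can run without ggmap.
safe_geocode <- function(address, geocoder) {
  na_row <- data.frame(lon = NA_real_, lat = NA_real_, address = NA_character_,
                       stringsAsFactors = FALSE)
  tryCatch(
    geocoder(address),
    warning = function(w) na_row,  # e.g. no match found
    error   = function(e) na_row   # e.g. malformed input or a failed request
  )
}

# Hypothetical stand-in for geocode(): errors on empty input, warns on "unknown".
fake_geocode <- function(address) {
  if (!nzchar(address)) stop("empty address")
  if (address == "unknown") warning("no match found")
  data.frame(lon = -111.9, lat = 40.8, address = tolower(address),
             stringsAsFactors = FALSE)
}

addresses <- c("Salt Lake City, UT", "unknown", "")
results <- do.call(rbind, lapply(addresses, safe_geocode, geocoder = fake_geocode))
print(results)
```

With ggmap loaded, the stand-in could presumably be replaced by `function(a) geocode(a, output = "latlona", source = "google")`, keeping the rest unchanged.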

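Alternatively, you can do this faster and more cleanly without the loop and the error checking. However, without a reproducible example of your data, there is no way to know whether this will preserve all of the information you need.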

# Substituted for for loop 
result <- geocode(origAddress$addresses, output = "latlona", source = "google") 
origAddress <- cbind(origAddress$addresses, result) 
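
Whether the other columns survive depends on what is passed to cbind(). The sketch below illustrates the difference with stand-in data: the `id` column and all values are made up for illustration, and `result` only imitates the shape that geocode() is described as returning above.

```r
# Stand-in for the CSV contents; the `id` column is a hypothetical extra field.
origAddress <- data.frame(id = 1:2,
                          addresses = c("Address A", "Address B"),
                          stringsAsFactors = FALSE)

# Stand-in for the geocoding result (made-up values, second lookup failed).
result <- data.frame(lon = c(-111.9, NA), lat = c(40.8, NA),
                     address = c("address a", NA),
                     stringsAsFactors = FALSE)

# Binding only the address vector drops every other original column.
only_addresses <- cbind(origAddress$addresses, result)

# Binding the whole data frame keeps them.
everything <- cbind(origAddress, result)

print(names(everything))
```

If the input file has columns beyond `addresses`, `cbind(origAddress, result)` is probably the safer choice.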

This works! Thank you so much @Ben Fasoli! – Walker


I have also run into this problem... does anyone have any suggestions? 'query max exceeded, see ?geocode. current total = 2500' – Walker


Looking at the geocode() documentation, try setting override_limit = TRUE –
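
Another option, if overriding the limit is not appropriate for your account, is to split the rows into batches no larger than the quoted total and geocode one batch per quota period. This is only a sketch of the batching arithmetic: the 2500 figure is taken from the message above, and how often the quota resets is an assumption to check against the service's terms.

```r
n_rows     <- 32000  # roughly the 32K rows mentioned in the question
batch_size <- 2500   # the total quoted in the message above

# Assign each row index to a batch, then split the indices by batch.
batch_id <- ceiling(seq_len(n_rows) / batch_size)
batches  <- split(seq_len(n_rows), batch_id)

print(length(batches))   # number of batches needed
print(lengths(batches))  # rows in each batch
```

Each element of `batches` is a vector of row indices, so a loop like the one above could iterate over `batches[[k]]` instead of `1:nrow(origAddress)`.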