2016-10-13
1

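In R, evaluating a function between two data frames

I'm trying to evaluate values across two data frames and create a new data frame containing the results. I'm new to the power of R and I'm trying to avoid old coding habits. In other words, I'm desperately trying to avoid loops, but in this case I can't figure out how to use something like plyr.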

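In the example, I create airports, pilots, and a function that computes distance in kilometers. My problem is determining which major airport each pilot is closest to, and the distance from each pilot to each airport.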

#Build Airports 
code <- c("IAH", "DFW", "Denver", "STL") 
lat <- c(29.97, 32.90, 39.75, 38.75) 
long <- c(95.35, 97.03, 104.87, 90.37) 
airports <- data.frame(code, lat, long) 

#Build Pilots 
names <- c("James", "Fiona", "Seamus") 
lat <- c(32.335131, 44.913223, 28.849631) 
long <- c(-84.989067, -97.151334, -96.917240) 
pilots <- data.frame(names, lat, long) 

#Create distance function (haversine formula; inputs in decimal degrees)
distInKm <- function(lat1, long1, lat2, long2) {
    dlat = (lat2 * 0.01745329) - (lat1 * 0.01745329) #0.01745329 = pi/180, converts degrees to radians
    dlong = (long2 * 0.01745329) - (long1 * 0.01745329)
    #the second cosine takes lat2, not long2
    step1 = (sin(dlat/2))^2 + cos(lat1 * 0.01745329) * cos(lat2 * 0.01745329) * (sin(dlong/2))^2
    step2 = 2 * atan2(sqrt(step1), sqrt(1 - step1))
    dist = 6372.798 * step2 #6372.798 km is the radius of the earth (40041.47/(2 * pi))
    dist
}
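
For reference, the steps in this function appear to follow the haversine formula. The sketch below uses symbols that are not in the original post: φ is latitude and λ is longitude, both in radians, and R is the earth radius in kilometers. The three expressions correspond to step1, step2, and dist; note that both cosine terms take latitudes.

```latex
\[
a = \sin^2\!\left(\frac{\varphi_2 - \varphi_1}{2}\right)
    + \cos\varphi_1 \,\cos\varphi_2 \,\sin^2\!\left(\frac{\lambda_2 - \lambda_1}{2}\right),
\qquad
c = 2\,\operatorname{atan2}\!\left(\sqrt{a},\ \sqrt{1 - a}\right),
\qquad
d = R\,c
\]
```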

Thanks for your time.

Answer

3

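First, your airport longitudes are positive when they should be negative, which will throw off the results. Let's fix them so the results make more sense:

airports$long <- -airports$long

Now you can use apply to evaluate every airport for each pilot. The geosphere package has several functions for calculating straight-line distances, including distGeo and distHaversine. These functions take coordinates in (longitude, latitude) order, which is why the columns are selected as 3:2 below.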

library(geosphere) 

pilots$closest_airport <- apply(pilots[, 3:2], 1, function(x){ 
    airports[which.min(distGeo(x, airports[, 3:2])), 'code'] 
}) 

pilots$airport_distance <- apply(pilots[, 3:2], 1, function(x){ 
    min(distGeo(x, airports[, 3:2]))/1000 # /1000 to convert m to km 
}) 

pilots 
##    names      lat      long closest_airport airport_distance
## 1  James 32.33513 -84.98907             STL         862.5394
## 2  Fiona 44.91322 -97.15133          Denver         855.8088
## 3 Seamus 28.84963 -96.91724             IAH         196.3559
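
If geosphere is not available, the same idea can be sketched in base R. This is a minimal sketch rather than the method used above: it assumes a haversine helper modeled on the question's distInKm (named haversine_km here, a hypothetical name, with lat2 in the second cosine term) and airports stored with negative west longitudes. Because it uses a spherical model, the distances may differ slightly from those returned by distGeo.

```r
# Hypothetical helper: haversine great-circle distance in km.
# Inputs are decimal degrees; lat2/long2 may be vectors.
haversine_km <- function(lat1, long1, lat2, long2, radius = 6372.798) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlong <- (long2 - long1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlong / 2)^2
  2 * radius * atan2(sqrt(a), sqrt(1 - a))
}

airports <- data.frame(
  code = c("IAH", "DFW", "Denver", "STL"),
  lat  = c(29.97, 32.90, 39.75, 38.75),
  long = -c(95.35, 97.03, 104.87, 90.37),  # west longitudes are negative
  stringsAsFactors = FALSE
)
pilots <- data.frame(
  names = c("James", "Fiona", "Seamus"),
  lat   = c(32.335131, 44.913223, 28.849631),
  long  = c(-84.989067, -97.151334, -96.917240),
  stringsAsFactors = FALSE
)

# One numeric vector per pilot: the distance to every airport.
dists <- lapply(seq_len(nrow(pilots)), function(i) {
  haversine_km(pilots$lat[i], pilots$long[i], airports$lat, airports$long)
})

# Keep the nearest airport and its distance for each pilot.
pilots$closest_airport  <- airports$code[vapply(dists, which.min, integer(1))]
pilots$airport_distance <- vapply(dists, min, numeric(1))

pilots
```

The helper is vectorized over its second pair of arguments, so each pilot needs only one call; lapply and vapply hide the iteration in the same way apply does above.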

Or, if you want all the distances instead of only the minimum, cbind the matrix returned by apply:

pilots <- cbind(pilots, t(apply(pilots[, 3:2], 1, function(x){ 
    setNames(distGeo(x, airports[, 3:2])/1000, airports$code) 
}))) 

pilots 
##    names      lat      long closest_airport       IAH       DFW    Denver       STL
## 1  James 32.33513 -84.98907             STL 1021.6523 1131.2129 1965.6586  862.5394
## 2  Fiona 44.91322 -97.15133          Denver 1666.0359 1333.6842  855.8088  885.8480
## 3 Seamus 28.84963 -96.91724             IAH  196.3559  449.1838 1412.0664 1253.4874
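
The full pilot-by-airport table can also be built in one step with base R's outer(). The following is only a sketch under stated assumptions: haversine_km is a hypothetical helper modeled on the question's distInKm (with lat2 in the second cosine term), and the airport longitudes are already negative. It relies on the helper being vectorized, since outer() calls the function once with expanded index vectors.

```r
# Hypothetical helper: haversine great-circle distance in km (degrees in).
haversine_km <- function(lat1, long1, lat2, long2, radius = 6372.798) {
  to_rad <- pi / 180
  a <- sin((lat2 - lat1) * to_rad / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) *
    sin((long2 - long1) * to_rad / 2)^2
  2 * radius * atan2(sqrt(a), sqrt(1 - a))
}

airports <- data.frame(
  code = c("IAH", "DFW", "Denver", "STL"),
  lat  = c(29.97, 32.90, 39.75, 38.75),
  long = -c(95.35, 97.03, 104.87, 90.37),  # west longitudes are negative
  stringsAsFactors = FALSE
)
pilots <- data.frame(
  names = c("James", "Fiona", "Seamus"),
  lat   = c(32.335131, 44.913223, 28.849631),
  long  = c(-84.989067, -97.151334, -96.917240),
  stringsAsFactors = FALSE
)

# Rows index pilots, columns index airports.
dist_km <- outer(
  seq_len(nrow(pilots)), seq_len(nrow(airports)),
  function(i, j) haversine_km(pilots$lat[i], pilots$long[i],
                              airports$lat[j], airports$long[j])
)
dimnames(dist_km) <- list(pilots$names, airports$code)

round(dist_km, 1)

# The nearest airport per pilot can be read off each row.
closest <- airports$code[apply(dist_km, 1, which.min)]
closest
```

Keeping the distances in a named matrix separates the computation from the pilots data frame; cbind(pilots, dist_km) would attach the columns as in the output above.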

Translated to dplyr, the successor to plyr:

library(dplyr) 

pilots %>% rowwise() %>% 
     mutate(closest_airport = airports[which.min(distGeo(c(long, lat), airports[, 3:2])), 'code'], 
       airport_distance = min(distGeo(c(long, lat), airports[, 3:2]))/1000) 

## Source: local data frame [3 x 5]
## Groups: <by row>
##
## # A tibble: 3 × 5
##    names      lat      long closest_airport airport_distance
##   <fctr>    <dbl>     <dbl>          <fctr>            <dbl>
## 1  James 32.33513 -84.98907             STL         862.5394
## 2  Fiona 44.91322 -97.15133          Denver         855.8088
## 3 Seamus 28.84963 -96.91724             IAH         196.3559

Or, for all the distances, use bind_cols with the approach above, or unnest a list column and reshape:

library(tidyverse) 

pilots %>% rowwise() %>% 
    mutate(closest_airport = airports[which.min(distGeo(c(long, lat), airports[, 3:2])), 'code'], 
      data = list(data_frame(airport = airports$code, 
            distance = distGeo(c(long, lat), airports[, 3:2])/1000))) %>% 
    unnest() %>% 
    spread(airport, distance) 

## # A tibble: 3 × 8
##    names      lat      long closest_airport    Denver       DFW       IAH       STL
## * <fctr>    <dbl>     <dbl>          <fctr>     <dbl>     <dbl>     <dbl>     <dbl>
## 1  Fiona 44.91322 -97.15133          Denver  855.8088 1333.6842 1666.0359  885.8480
## 2  James 32.33513 -84.98907             STL 1965.6586 1131.2129 1021.6523  862.5394
## 3 Seamus 28.84963 -96.91724             IAH 1412.0664  449.1838  196.3559 1253.4874

Or, more directly but less readably:

pilots %>% rowwise() %>% 
    mutate(closest_airport = airports[which.min(distGeo(c(long, lat), airports[, 3:2])), 'code'], 
      data = (distGeo(c(long, lat), airports[, 3:2])/1000) %>% 
        setNames(airports$code) %>% t() %>% as_data_frame() %>% list()) %>% 
    unnest() 

## # A tibble: 3 × 8
##    names      lat      long closest_airport       IAH       DFW    Denver       STL
##   <fctr>    <dbl>     <dbl>          <fctr>     <dbl>     <dbl>     <dbl>     <dbl>
## 1  James 32.33513 -84.98907             STL 1021.6523 1131.2129 1965.6586  862.5394
## 2  Fiona 44.91322 -97.15133          Denver 1666.0359 1333.6842  855.8088  885.8480
## 3 Seamus 28.84963 -96.91724             IAH  196.3559  449.1838 1412.0664 1253.4874
+0

The OP is trying to determine which major airport each pilot is closest to, not which pilot is closest to each airport. – HubertL

+0

@HubertL Whoops, I read it backwards. Fixed. – alistaire

+0

You read it backwards, which led you to write it backwards. – HubertL