2015-11-08 106 views
0

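Find the 5 closest stations to lat/long coordinates

I am trying to find the 5 closest stations in one data set (set2) for each point in another data set (set1). This post is what I used as a basis, and finding the single closest station seems easy, but I am writing for loops to handle it, which is not efficient. I am also getting an error and don't understand why it isn't working. Ideally, I would like to use set1 to look up the nearest stations in set2, find the 5 closest ones, and add one column per station for each unique identifier in set1.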

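Edit: This question differs from "How to assign a name to lat-long observations based on shortest distance" because I am trying to find the 5 closest stations, not just the single closest one. The method for finding the minimum is also different. Please reopen this question.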

dput:

set1 <- structure(list(id = c(5984, 7495, 4752, 2654, 4578, 9865, 3265, 
1252, 4679, 1346), lat = c(48.39167, 48.148056, 48.721111, 47.189167, 
47.054443, 47.129166, 47.306667, 47.84, 47.304167, 48.109444), 
    lon = c(13.671114, 12.866947, 15.94223, 11.099736, 12.958342, 
    14.203892, 11.86389, 16.526674, 16.193064, 17.071392)), row.names = c(NA, 
10L), class = "data.frame", .Names = c("id", "lat", "lon")) 

set2 <- structure(list(id = 1:10, lat = structure(c(35.8499984741211, 
34.75, 70.9329986572266, 78.25, 69.6829986572266, 74.515998840332, 
70.3659973144531, 67.265998840332, 63.6990013122559, 60.1990013122559 
), .Dim = 10L), lon = structure(c(14.4829998016357, 32.4000015258789, 
-8.66600036621094, 15.4670000076294, 18.9160003662109, 19.0160007476807, 
31.0990009307861, 14.3660001754761, 9.59899997711182, 11.0830001831055 
), .Dim = 10L)), row.names = c(NA, 10L), class = "data.frame", .Names = c("id", 
"lat", "lon")) 

Code:

library(rgeos) 
library(sp) 


set1sp <- SpatialPoints(set1) 
set2sp <- SpatialPoints(set2) 
for (i in length(set1$id)){ 
    for (j in 4:9){ 
    if(i == 1) { 
     sub <- set2 
     set1[i,j] <- apply(gDistance(set1sp, set2sp, byid=TRUE), 1, which.min) 
     sub <- filter(sub, id != set1[i,j])} 
    else{ 
     set1[i,j] <- apply(gDistance(set1sp, set2sp, byid=TRUE), 1, which.min) 
     sub <- filter(sub, id != set1[i,j])} 
    } 
} 

Error output:

Error in `[<-.data.frame`(`*tmp*`, i, j, value = c(8L, 8L, 8L, 8L, 8L, : 
    replacement has 10 rows, data has 1 
+0

The error is most likely being generated because you are missing a `1:` before `length(set1$id)`. –
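To illustrate the comment above, here is a minimal sketch using made-up toy data (the `ids` vector and `df` are hypothetical stand-ins, not the question's objects). It shows that `for (i in length(x))` runs only once, and that assigning several values into a single data frame cell fails, which is consistent with the "replacement has 10 rows, data has 1" message in the question.

```r
ids <- c(5984, 7495, 4752)

# length(ids) is the single number 3, so this loop body runs once, with i = 3
visited_wrong <- integer(0)
for (i in length(ids)) visited_wrong <- c(visited_wrong, i)

# seq_along(ids) is 1, 2, 3, so this loop visits every element
visited_right <- integer(0)
for (i in seq_along(ids)) visited_right <- c(visited_right, i)

print(visited_wrong)
print(visited_right)

# Assigning a vector of several values into one cell of a data frame fails
df <- data.frame(id = ids, nearest = NA_integer_)
res <- try(df[1, 2] <- c(8L, 8L, 8L), silent = TRUE)
print(inherits(res, "try-error"))
```

In the question's loop, the right-hand side yields one value per point rather than a single value, so fixing the loop range alone would probably not remove the error.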

+0

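What are `set1sp` and `set2sp`? They are not defined. Also, what projection system are your points in? If you only want to add 5 columns, you probably want `j in 4:8` instead of 9. –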

+0

@JaredSmith Sorry, I have added set1sp and set2sp – Vedda

Answers

1

I had to set the projection system on the coordinates of set1sp and set2sp in order for gDistance to work. I assumed WGS84.

dummyset1= set1 
dummyset2= set2 
coordinates(set1) = c('lon', 'lat') 
coordinates(set2) = c('lon', 'lat') 
proj4string(set1) = "+proj=longlat +datum=WGS84" 
proj4string(set2) = "+proj=longlat +datum=WGS84" 
set1sp = set1 
set2sp = set2 
set1 = dummyset1 
set2 = dummyset2 

This loop returns the desired output while following the general structure of your for loop.

for (i in 1:length(set1$id)){ 
    #Store the projected data in a dummy variable sub 
    sub <- set2sp 
    for (j in 4:8){ 
     if (j == 4){ 
      set1[i,j] <- apply(gDistance(set2sp['id'], set1sp['id'][i,], byid=TRUE), 1, which.min) 
      #Remove the index of the closest point from sub. 
      sub <- sub[which(sub$id != set1[i,j]), ] 
     } 
     else { 
      #Note that sub is now being checked instead of set2sp. This is because sub has had the index of the closest point removed. 
      set1[i,j] <- apply(gDistance(sub['id'], set1sp['id'][i,], byid=TRUE), 1, which.min) 
      sub <- sub[which(sub$id != set1[i,j]), ] 
     } 
    } 
} 

The resulting output is:

set1 
     id      lat      lon V4 V5 V6 V7 V8
1  5984 48.39167 13.67111 10  1  8  7  6
2  7495 48.14806 12.86695 10  1  8  7  6
3  4752 48.72111 15.94223 10  1  8  7  6
4  2654 47.18917 11.09974  1  9  8  7  6
5  4578 47.05444 12.95834  1  9  8  7  6
6  9865 47.12917 14.20389  1  9  8  7  6
7  3265 47.30667 11.86389  1  9  8  7  6
8  1252 47.84000 16.52667  1  9  8  7  6
9  4679 47.30417 16.19306  1  9  8  7  6
10 1346 48.10944 17.07139  1  9  8  7  6
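One thing worth checking in a remove-and-repeat loop like this: `which.min` returns a row position within `sub`, and once rows have been removed that position no longer has to equal the station `id`. The values in the later columns may therefore be positions rather than ids. The sketch below uses a made-up toy table (hypothetical distances, not the real stations) to show the difference and one way to record the id instead.

```r
# Toy stand-in for set2: ids 1..5 with made-up distances to one query point
sub <- data.frame(id = 1:5, dist = c(30, 10, 50, 20, 40))

nearest_pos <- integer(0)  # what which.min() returns: a row position in `sub`
nearest_id  <- integer(0)  # the station id stored at that position

for (k in 1:3) {
  pos <- which.min(sub$dist)
  nearest_pos <- c(nearest_pos, pos)
  nearest_id  <- c(nearest_id, sub$id[pos])
  sub <- sub[-pos, ]       # drop by position so the matching row is removed
}

print(nearest_pos)  # 2 3 1
print(nearest_id)   # 2 4 1
```

After the first removal the two vectors diverge, so storing `sub$id[pos]` rather than `pos` keeps the result tied to the station identifiers.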
+0

This definitely works, but I am getting this error: '50: In RGEOSDistanceFunc(spgeom1, spgeom2, byid, "rgeos_distance"): Spatial object 2 is not projected; GEOS expects planar coordinates' – Vedda

+1

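I think the warning is a result of using coordinates in degrees rather than planar coordinates (e.g. UTM coordinates). You can use `spTransform()` to convert the data to a planar system appropriate for your area. –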

+1

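Try `order(apply(spDists(set1, set2), 2, min))[1:5]`: `spDists` computes the great-circle distance matrix for long/lat data, defined as above. `apply` takes the minimum for each column (the points in set2, over set1), and `order` and `[1:5]` find the five closest. This is not a duplicate question as described above. –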

1

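The following computes great-circle distances from all points in set 2 with respect to set 1. It then takes the minimum over set 1 and orders the results; then it plots them.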

library(sp) 
coordinates(set1) = c('lon', 'lat') 
coordinates(set2) = c('lon', 'lat') 
proj4string(set1) = "+proj=longlat +datum=WGS84" 
proj4string(set2) = "+proj=longlat +datum=WGS84" 
d = apply(spDists(set1,set2),2,min) 
order(d)[1:5] 
# [1] 1 10 9 2 8 
plot(set2, pch=2, axes=TRUE) 
points(set1) 
o = order(d)[1:5] 
points(set2[o,], col = 'red', pch=16) 

[Image: plot of set2 (triangles) and set1 (circles), with the five selected set2 points in red]
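The code above selects five set2 points that are closest to any point in set1. If the goal is five stations per set1 point, as the question describes, a per-row variant might look like the sketch below. It is an assumption-laden illustration, not the answer's method: it swaps `spDists` for a hand-written haversine formula in base R (spherical Earth, radius 6371 km), and the `q` and `s` data frames are small made-up stand-ins for set1 and set2.

```r
# Great-circle (haversine) distance in km; inputs in decimal degrees.
# The 6371 km radius is a spherical approximation (an assumption here).
haversine_km <- function(lat1, lon1, lat2, lon2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Small made-up stand-ins for set1 (query points) and set2 (stations)
q <- data.frame(id = c(101, 102), lat = c(48.0, 47.0), lon = c(13.0, 16.0))
s <- data.frame(id = 1:6,
                lat = c(35.85, 34.75, 70.93, 78.25, 63.70, 60.20),
                lon = c(14.48, 32.40, -8.67, 15.47, 9.60, 11.08))

# Distance matrix: one row per query point, one column per station
d <- outer(seq_len(nrow(q)), seq_len(nrow(s)),
           function(i, j) haversine_km(q$lat[i], q$lon[i], s$lat[j], s$lon[j]))

k <- 5
# For each query point, the ids of the k stations with the smallest distance
nearest <- t(apply(d, 1, function(row) s$id[order(row)[1:k]]))
colnames(nearest) <- paste0("nearest", 1:k)

result <- cbind(q, nearest)
print(result)
```

Because the ids are looked up through `order()` on the full distance row, no rows need to be removed between steps, and the result holds station ids rather than positions.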