0
問題在改進循環使用其他方法
有1個主站(df
)和堆疊在單個data.frame與值三天3個本地站(s
)。我們的想法是每天從主站獲取三個本地站的相對異常,並使用phylin
包中的反距離加權(IDW)對其進行平滑處理。然後通過乘法應用於主站中的value
。
關於改進此代碼的任何建議(例如data.table
,dplyr
,apply
)?如果沒有繁瑣的for
循環,我仍然不知道如何處理這個問題。
dput
s <- structure(list(id = c("USC00031152", "USC00034638", "USC00036352",
"USC00031152", "USC00034638", "USC00036352", "USC00031152", "USC00034638",
"USC00036352"), lat = c(33.59, 34.7392, 35.2833, 33.59, 34.7392,
35.2833, 33.59, 34.7392, 35.2833), long = c(-92.8236, -90.7664,
-93.1, -92.8236, -90.7664, -93.1, -92.8236, -90.7664, -93.1),
year = c(1900, 1900, 1900, 1900, 1900, 1900, 1900, 1900,
1900), month = c(1, 1, 1, 1, 1, 1, 1, 1, 1), day = c(1L,
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), value = c(63.3157576809045,
86.0490598902219, 76.506386949066, 71.3760752788486, 89.9119576975542,
76.3535163951321, 53.7259645981243, 61.7989638892985, 85.8911224149051
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-9L), .Names = c("id", "lat", "long", "year", "month", "day",
"value"))
df <- structure(list(id = c(12345, 12345, 12345), lat = c(100, 100,
100), long = c(50, 50, 50), year = c(1900, 1900, 1900), month = c(1,
1, 1), day = 1:3, value = c(54.8780020601509, 106.966029162171,
98.3198828955801)), row.names = c(NA, -3L), class = "data.frame", .Names = c("id",
"lat", "long", "year", "month", "day", "value"))
代碼
library(phylin)
nearest <- function(i, loc){
# Stack 3 local stations
stack <- s[loc:(loc+2),]
# Get 1 main station
station <- df[i,]
# Check for NA and build relative anomaly (r)
stack <- stack[!is.na(stack$value),]
stack$r <- stack$value/station$value
# Use IDW and return v
v <- as.numeric(ifelse(dim(stack)[1] == 1,
stack$r,
idw(stack$r, stack[,c(2,3,8)], station[,2:3])))
return(v)
}
ncdc <- 1
for (i in 1:nrow(df)){
# Get relative anomaly from function
r <- nearest(i, ncdc)
# Get value from main station and apply anomaly
p <- df[i,7]
df[i,7] <- p*r
# Iterate to next 3 local stations
ncdc <- ncdc + 3
}
謝謝。這是'apply'函數的一個很好的用法,我想知道它是否可以並行以提高速度。 – Vedda
對於並行版本的應用,您應該查看'multicore'軟件包中的'mcapply'。 –