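How to count the number of events within specific time periods

I am trying to count the number of events in "df2" (each row is one event) that fall within the time periods defined in "df1". I can do this for each whole time period (roughly 5 minutes), but I would like to break each period into smaller chunks (1 minute) and run the same calculation.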
df1<- structure(list(Location = 1:10, Lattitude = c(57.140532, 57.140527,
57.13959, 57.13974, 57.14059, 57.14058, 57.1398, 57.13989, 57.14158,
57.14386), t_in = structure(c(1455626730, 1455627326, 1455628122,
1455628644, 1455629174, 1455629708, 1455630230, 1455630765, 1455631396,
1455631931), class = c("POSIXct", "POSIXt"), tzone = ""), t_out = structure(c(1455627047,
1455627615, 1455628462, 1455628933, 1455629486, 1455630015, 1455630552,
1455631070, 1455631719, 1455632242), class = c("POSIXct", "POSIXt"
), tzone = "")), .Names = c("Location", "Lattitude", "t_in",
"t_out"), class = "data.frame", row.names = c(NA, -10L))
df2<- structure(list(date.time = structure(c(1455630964, 1455630976,
1455630987, 1455630998, 1455631009, 1455631021, 1455631032, 1455631043,
1455631054, 1455631066, 1455631077, 1455631088, 1455631099, 1455631111,
1455631423, 1455631446, 1455631479, 1455631502, 1455631569, 1455631772
), class = c("POSIXct", "POSIXt"), tzone = ""), code = structure(c(2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L), .Label = c("1003", "32221"), class = "factor"),
rec_id = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("301976",
"301978", "301985", "301988"), class = "factor"), Lattitude = c("57.14066",
"57.14066", "57.14066", "57.14066", "57.14066", "57.14066",
"57.14066", "57.14066", "57.14066", "57.14066", "57.14066",
"57.14066", "57.14066", "57.14066", "57.141869", "57.141869",
"57.141869", "57.141869", "57.141869", "57.141869"), Longitude = c("2.075702",
"2.075702", "2.075702", "2.075702", "2.075702", "2.075702",
"2.075702", "2.075702", "2.075702", "2.075702", "2.075702",
"2.075702", "2.075702", "2.075702", "2.081576", "2.081576",
"2.081576", "2.081576", "2.081576", "2.081576"), Location = list(
8, 8, 8, 8, 8, 8, 8, 8, 8, 8, NA, NA, NA, NA, 9, 9, 9,
9, 9, NA)), .Names = c("date.time", "code", "rec_id",
"Lattitude", "Longitude", "Location"), row.names = 94:113, class = "data.frame")
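The function below returns the Location from df1 whenever date.time in df2 lies between df1$t_in and df1$t_out. It seems a roundabout way of doing it, but it let me carry out the later calculation with this code: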
ids <- as.numeric(df1$Location)
f <- function(x){
a <- ids[ (df1$t_in < x) & (x < df1$t_out) ]
if (length(a) == 0) NA else a
}
df2$Location <- lapply(df2$date.time, f)
The above returns a list, so it needs to be turned into a numeric vector. A bit of a faff, but I couldn't find a way round it:
df2$Location<- paste(df2$Location)
df2$Location<- as.numeric(df2$Location)
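One possible way to skip the list-then-paste conversion is to have the lookup return a numeric value directly through `vapply`. This is only a sketch on a small subset of the data above (locations 8 and 9, four events); the helper name `find_location` is made up for illustration, and it assumes the time windows in df1 never overlap, so that at most one location matches each event.

```r
# Toy subset of the question's data (assumption: windows do not overlap)
to_time <- function(s) as.POSIXct(s, origin = "1970-01-01", tz = "UTC")
df1 <- data.frame(
  Location = c(8, 9),
  t_in  = to_time(c(1455630765, 1455631396)),
  t_out = to_time(c(1455631070, 1455631719))
)
df2 <- data.frame(
  date.time = to_time(c(1455630964, 1455631077, 1455631423, 1455631772))
)

# Hypothetical helper: Location whose window contains event i, else NA
find_location <- function(i) {
  x <- df2$date.time[i]
  hit <- df1$Location[df1$t_in < x & x < df1$t_out]
  if (length(hit) == 0) NA_real_ else hit[1]
}

# vapply guarantees one numeric value per event, so no list is produced
df2$Location <- vapply(seq_len(nrow(df2)), find_location, numeric(1))
print(df2)
```

With this, `df2$Location` is already numeric, and events outside every window come back as `NA` without a coercion warning.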
The NAs are then removed, since those events lie outside the time periods defined in df1 and are therefore not relevant.
df2<-df2[!is.na(df2$Location),]
Then count the number of events (i.e. rows) for each rec_id and Location:
library (plyr)
df3 <- ddply(df2, c("rec_id","Location"), function(df){data.frame (detections=nrow(df))})
rec_id Location detections
1 301976 9 5
2 301978 8 10
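...Perfect!
But I want to do this over shorter time periods: every minute, to be exact. The periods should start at t_in (df1) and run until t_out (df1). I could do a lot of this in Excel, but surely it can be automated in R (this is a large dataset).
So ultimately I want to count the events (nrow) at each Location in df1 for every 1-minute period between t_in and t_out.
For example (just a visual illustration, not actual data):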
rec_id Location minute(or period) detections
301976 9 1 1
301976 9 2 2
301976 9 3 0
301976 9 4 0
301976 9 5 2
301978 8 1 4
301978 8 2 3
301978 8 3 1
301978 8 4 0
301978 8 5 2
I can create the intervals for the first Location, but I'm not sure how to take this any further:
seq(from = head(df1$t_in,1), to = head(df1$t_out,1) , by = "mins")
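One way the per-minute counting might be approached in base R is sketched below, on a subset of the data above (locations 8 and 9, with Location already assigned to each event). It is a sketch under assumptions rather than a definitive answer: the helper `count_one_window` and the result name `df4` are invented for illustration, minutes are numbered from t_in, and the last bin of each window may be shorter than a full minute.

```r
# Toy subset of the question's data: locations 8 and 9 only
to_time <- function(s) as.POSIXct(s, origin = "1970-01-01", tz = "UTC")
df1 <- data.frame(
  Location = c(8, 9),
  t_in  = to_time(c(1455630765, 1455631396)),
  t_out = to_time(c(1455631070, 1455631719))
)
df2 <- data.frame(
  rec_id = rep(c("301978", "301976"), times = c(10, 5)),
  Location = rep(c(8, 9), times = c(10, 5)),
  date.time = to_time(c(
    1455630964, 1455630976, 1455630987, 1455630998, 1455631009,
    1455631021, 1455631032, 1455631043, 1455631054, 1455631066,
    1455631423, 1455631446, 1455631479, 1455631502, 1455631569)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: counts per rec_id and 1-minute bin for row i of df1
count_one_window <- function(i) {
  t0 <- as.numeric(df1$t_in[i])
  # Number of bins, counting a trailing partial minute as its own bin
  n_bins <- ceiling((as.numeric(df1$t_out[i]) - t0) / 60)
  ev <- df2[!is.na(df2$Location) & df2$Location == df1$Location[i], ]
  if (nrow(ev) == 0) return(NULL)
  # Minute index from t_in; fixed factor levels keep the empty minutes
  minute <- factor(floor((as.numeric(ev$date.time) - t0) / 60) + 1,
                   levels = seq_len(n_bins))
  tab <- table(rec_id = as.character(ev$rec_id), minute = minute)
  out <- as.data.frame(tab, responseName = "detections",
                       stringsAsFactors = FALSE)
  out$minute <- as.integer(out$minute)
  out$Location <- df1$Location[i]
  out[order(out$rec_id, out$minute),
      c("rec_id", "Location", "minute", "detections")]
}

df4 <- do.call(rbind, lapply(seq_len(nrow(df1)), count_one_window))
rownames(df4) <- NULL
print(df4)
```

Turning the minute index into a factor with fixed levels is what makes the zero-count minutes appear in the result. On this subset, Location 8 has no detections in minutes 1 to 3 and then 4, 5 and 1 in minutes 4 to 6, while Location 9 has 2, 2 and 1 in minutes 1 to 3 and none afterwards.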
Thanks! Works perfectly. Speed is not important!