How to count the number of events within specific time periods

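I am trying to count the number of events in "df2" (each row is one event) that fall within the time periods defined in "df1". I can already do this for each whole period (roughly 5 minutes long), but I would like to break the periods into smaller chunks (1 minute) and do the same calculation.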

df1<- structure(list(Location = 1:10, Lattitude = c(57.140532, 57.140527, 
57.13959, 57.13974, 57.14059, 57.14058, 57.1398, 57.13989, 57.14158, 
57.14386), t_in = structure(c(1455626730, 1455627326, 1455628122, 
1455628644, 1455629174, 1455629708, 1455630230, 1455630765, 1455631396, 
1455631931), class = c("POSIXct", "POSIXt"), tzone = ""), t_out = structure(c(1455627047, 
1455627615, 1455628462, 1455628933, 1455629486, 1455630015, 1455630552, 
1455631070, 1455631719, 1455632242), class = c("POSIXct", "POSIXt" 
), tzone = "")), .Names = c("Location", "Lattitude", "t_in", 
"t_out"), class = "data.frame", row.names = c(NA, -10L)) 

df2<- structure(list(date.time = structure(c(1455630964, 1455630976, 
1455630987, 1455630998, 1455631009, 1455631021, 1455631032, 1455631043, 
1455631054, 1455631066, 1455631077, 1455631088, 1455631099, 1455631111, 
1455631423, 1455631446, 1455631479, 1455631502, 1455631569, 1455631772 
), class = c("POSIXct", "POSIXt"), tzone = ""), code = structure(c(2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L), .Label = c("1003", "32221"), class = "factor"), 
rec_id = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("301976", 
"301978", "301985", "301988"), class = "factor"), Lattitude = c("57.14066", 
"57.14066", "57.14066", "57.14066", "57.14066", "57.14066", 
"57.14066", "57.14066", "57.14066", "57.14066", "57.14066", 
"57.14066", "57.14066", "57.14066", "57.141869", "57.141869", 
"57.141869", "57.141869", "57.141869", "57.141869"), Longitude = c("2.075702", 
"2.075702", "2.075702", "2.075702", "2.075702", "2.075702", 
"2.075702", "2.075702", "2.075702", "2.075702", "2.075702", 
"2.075702", "2.075702", "2.075702", "2.081576", "2.081576", 
"2.081576", "2.081576", "2.081576", "2.081576"), Location = list(
    8, 8, 8, 8, 8, 8, 8, 8, 8, 8, NA, NA, NA, NA, 9, 9, 9, 
    9, 9, NA)), .Names = c("date.time", "code", "rec_id", 
"Lattitude", "Longitude", "Location"), row.names = 94:113, class = "data.frame")

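The function below returns the Location from df1 if a date.time in df2 falls between df1$t_in and df1$t_out. It seems a roundabout way of doing it, but it lets me do the counting afterwards with this code: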

ids <- as.numeric(df1$Location) 
f <- function(x){ 
    a <- ids[ (df1$t_in < x) & (x < df1$t_out) ] 
    if (length(a) == 0) NA else a 
} 

df2$Location <- lapply(df2$date.time, f)

The above returns a list, so it needs to be converted to numeric. It is a bit of a faff, but I can't find a way round it:

df2$Location<- paste(df2$Location) 
df2$Location<- as.numeric(df2$Location)
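The list-to-numeric conversion can probably be avoided by asking for a numeric vector in the first place. The following is a minimal sketch on toy data, not the original data set: `periods`, `events` and `find_location` are hypothetical names, and taking only the first match assumes the periods never overlap.

```r
# Toy stand-ins for df1 and df2$date.time (hypothetical values, not the real data)
origin <- as.POSIXct("2016-02-16 12:00:00", tz = "UTC")
periods <- data.frame(
  Location = 1:2,
  t_in  = origin + c(0, 600),
  t_out = origin + c(300, 900)
)
events <- origin + c(10, 200, 450, 700)

# Return exactly one numeric value per event: the matching Location, or NA.
# Assumption: periods do not overlap, so the first match is the only match.
find_location <- function(x) {
  hit <- periods$Location[periods$t_in < x & x < periods$t_out]
  if (length(hit) == 0) NA_real_ else as.numeric(hit[1])
}

# vapply() returns a plain numeric vector, so no paste()/as.numeric() step is needed
location <- vapply(seq_along(events),
                   function(i) find_location(events[i]),
                   numeric(1))
print(location)
```

Because `vapply()` checks that every result is a single numeric value, an event that unexpectedly matched nothing or returned the wrong type would raise an error instead of silently producing a list column.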

The NAs are then removed, because those events lie outside the time periods defined in df1 and are therefore not relevant.

df2<-df2[!is.na(df2$Location),]

Then count the number of events (i.e. rows) for each rec_id and Location:

library (plyr) 
df3 <- ddply(df2, c("rec_id","Location"), function(df){data.frame (detections=nrow(df))}) 

  rec_id Location detections
1 301976        9          5
2 301978        8         10

...perfect!
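If avoiding the plyr dependency is of interest, the same kind of count can be sketched in base R with `aggregate()`. This is only an illustration on toy data: `tagged` and `counts` are hypothetical names, and the values are made up.

```r
# Toy events that have already been tagged with a Location (hypothetical values)
tagged <- data.frame(
  rec_id   = c("301978", "301978", "301976", "301976", "301976"),
  Location = c(8, 8, 9, 9, 9)
)

# Every row is one event, so give each row a weight of 1 and sum within groups
tagged$detections <- 1L
counts <- aggregate(detections ~ rec_id + Location, data = tagged, FUN = sum)
print(counts)
```

The result has one row per rec_id/Location combination that actually occurs, which matches the shape of the `ddply()` output above.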

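However, I want to do this over shorter periods, each exactly one minute long. The periods should start at t_in (df1) and run until t_out (df1). I could do much of this in Excel, but surely it can be automated in R (this is a large dataset).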

So ultimately I want to count the events (nrow) at each location in df1 for every 1-minute period between t_in and t_out.

For example (just a visual illustration, not the actual data):

rec_id Location minute (or period) detections
301976        9                  1          1
301976        9                  2          2
301976        9                  3          0
301976        9                  4          0
301976        9                  5          2
301978        8                  1          4
301978        8                  2          3
301978        8                  3          1
301978        8                  4          0
301978        8                  5          2

I can create the intervals for the first location, but I'm not sure how to take this further:

seq(from = head(df1$t_in,1), to = head(df1$t_out,1) , by = "mins")

Source

2016-02-20 Salmo salar

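I think the following can be used to generate a new version of df1 in which each period is split into the intervals produced by seq(). You can then apply your steps above to the new df1.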

The steps could probably be combined, but I just want to make sure this actually gives you what you want.

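First, we expand the time intervals in the original data frame and generate a list of expanded periods. Each row of df1 becomes one element of the list.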

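# lapply() always returns a list; sapply() could simplify equal-length sequences into a matrix
res1 <- lapply(seq_len(nrow(df1)), function(i) { 
       seq(from = df1$t_in[i], to = df1$t_out[i], by = "mins")})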

Then we convert the list of sequences into data frames (two columns each):

# Pair each break with the next one: consecutive breaks form the start and end of one interval
res2 <- lapply(res1, function(x) { 
       data.frame(t_in = head(x, -1), t_out = tail(x, -1)) })

Finally, we merge them all together:

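# all = TRUE keeps every interval from every data frame (spelled out rather than the T shorthand)
df1v2 <- Reduce(function(...) merge(..., all = TRUE), res2)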

Then (adapting your code):

ids <- seq_len(nrow(df1v2)) 
f <- function(x){ 
    a <- ids[ (df1v2$t_in < x) & (x < df1v2$t_out) ] 
    if (length(a) == 0) NA else a 
} 

df2$Location <- lapply(df2$date.time, f)

This yields:

              date.time  code rec_id Lattitude Longitude Location
94  2016-02-16 14:56:04 32221 301978  57.14066  2.075702       37
95  2016-02-16 14:56:16 32221 301978  57.14066  2.075702       37
96  2016-02-16 14:56:27 32221 301978  57.14066  2.075702       37
97  2016-02-16 14:56:38 32221 301978  57.14066  2.075702       37
98  2016-02-16 14:56:49 32221 301978  57.14066  2.075702       38
99  2016-02-16 14:57:01 32221 301978  57.14066  2.075702       38
100 2016-02-16 14:57:12 32221 301978  57.14066  2.075702       38
101 2016-02-16 14:57:23 32221 301978  57.14066  2.075702       38
102 2016-02-16 14:57:34 32221 301978  57.14066  2.075702       38
103 2016-02-16 14:57:46 32221 301978  57.14066  2.075702       NA
104 2016-02-16 14:57:57 32221 301978  57.14066  2.075702       NA
105 2016-02-16 14:58:08 32221 301978  57.14066  2.075702       NA
106 2016-02-16 14:58:19 32221 301978  57.14066  2.075702       NA
107 2016-02-16 14:58:31 32221 301978  57.14066  2.075702       NA
108 2016-02-16 15:03:43 32221 301976 57.141869  2.081576       39
109 2016-02-16 15:04:06 32221 301976 57.141869  2.081576       39
110 2016-02-16 15:04:39 32221 301976 57.141869  2.081576       40
111 2016-02-16 15:05:02 32221 301976 57.141869  2.081576       40
112 2016-02-16 15:06:09 32221 301976 57.141869  2.081576       41
113 2016-02-16 15:09:32 32221 301976 57.141869  2.081576       NA
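The Location values in this output are row indices of df1v2 (37, 38, ...), not the original locations or a per-location minute number. One possible way to get closer to the table requested in the question is to carry Location and a minute counter along while building the intervals. The sketch below uses toy data, and `periods`, `events`, `intervals`, `match_row` and `counts` are hypothetical names. It also assumes half-open intervals (`t_in <= x < t_out`), which differs from the strict inequalities used in f, so that an event exactly on a minute boundary is counted once.

```r
# Toy stand-ins for df1 and df2 (hypothetical values, not the real data)
origin <- as.POSIXct("2016-02-16 12:00:00", tz = "UTC")
periods <- data.frame(
  Location = c(8, 9),
  t_in  = origin + c(0, 600),
  t_out = origin + c(150, 780)
)
events <- data.frame(
  rec_id    = c("A", "A", "A", "B", "B"),
  date.time = origin + c(10, 70, 75, 610, 2000)
)

# One row per whole minute of each period, keeping Location and a minute counter.
# As in the answer above, a trailing partial minute is dropped.
intervals <- do.call(rbind, lapply(seq_len(nrow(periods)), function(i) {
  brk <- seq(from = periods$t_in[i], to = periods$t_out[i], by = "mins")
  n <- length(brk) - 1
  data.frame(Location = rep(periods$Location[i], n),
             minute   = seq_len(n),
             t_in     = brk[seq_len(n)],
             t_out    = brk[seq_len(n) + 1])
}))

# Row index of the interval containing each event, or NA if no interval does
match_row <- vapply(seq_len(nrow(events)), function(j) {
  x <- events$date.time[j]
  hit <- which(intervals$t_in <= x & x < intervals$t_out)
  if (length(hit) == 0) NA_integer_ else hit[1]
}, integer(1))

events$Location <- intervals$Location[match_row]
events$minute   <- intervals$minute[match_row]

# Count events per rec_id, Location and minute; unmatched events are dropped
matched <- events[!is.na(match_row), ]
matched$detections <- 1L
counts <- aggregate(detections ~ rec_id + Location + minute,
                    data = matched, FUN = sum)
print(counts)
```

Minutes without any events do not appear in `counts`; showing them with a count of 0, as in the question's example table, would need an extra step that joins the counts back onto `intervals`.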

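I'm not sure whether the boundary checks are right (modify f if needed), but it looks like you have the hang of that part. How important is speed?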

Source

2016-02-21 20:16:17 ekstroem

Thanks! Works perfectly. Speed isn't important!
