1
我有以下的數據集,我想通過end.station.id
求和並變換成317行和72列矩陣間間隔
> sapply(df, class) $end.station.id [1] "integer" $stoptime [1] "POSIXct" "POSIXt" $interval [1] "POSIXct" "POSIXt" > dim(df) [1] 8256 3 > length(unique(df$end.station.id)) [1] 317 > length(unique(df$interval)) [1] 72 > head(df) end.station.id stoptime interval 14785 437 2014-08-18 21:08:36 2014-08-18 21:00:00 16980 406 2014-08-18 20:34:22 2014-08-18 20:30:00 20200 372 2014-08-18 22:53:33 2014-08-18 22:50:00 20935 2000 2014-08-18 22:43:18 2014-08-18 22:40:00 22610 499 2014-08-18 20:51:28 2014-08-18 20:50:00 22678 401 2014-08-18 20:05:54 2014-08-18 20:00:00
我已經無法此使用dplyr
library(dplyr); library(tidyr); > matrix % + group_by(end.station.id, interval)%>% + summarise(sum = nrow) %>% + spread(end.station.id, nrow) Error: not a vector
我已經想分配唯一整數每個間隔的,但因爲它在POSIXct格式,數據是當我嘗試提取柱interval
和順序將其與順序(X下降,= FALSE)丟失
最後,結果應該類似於這樣的矩陣,儘管填充了每個站點每間隔的總和。
> head(m) station_id 2014-08-18 20:00:00 2014-08-18 20:10:00 2014-08-18 20:20:00 1 302 0 0 0 2 487 0 0 0 3 218 0 0 0 4 465 0 0 0 5 160 0 0 0 6 291 0 0 0 2014-08-18 20:30:00 2014-08-18 20:40:00 2014-08-18 20:50:00 2014-08-18 21:00:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-18 21:10:00 2014-08-18 21:20:00 2014-08-18 21:30:00 2014-08-18 21:40:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-18 21:50:00 2014-08-18 22:00:00 2014-08-18 22:10:00 2014-08-18 22:20:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-18 22:30:00 2014-08-18 22:40:00 2014-08-18 22:50:00 2014-08-18 23:00:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-18 23:10:00 2014-08-18 23:20:00 2014-08-18 23:30:00 2014-08-18 23:40:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-18 23:50:00 2014-08-19 00:00:00 2014-08-19 00:10:00 2014-08-19 00:20:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-19 00:30:00 2014-08-19 00:40:00 2014-08-19 00:50:00 2014-08-19 01:00:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-19 01:10:00 2014-08-19 01:20:00 2014-08-19 01:30:00 2014-08-19 01:40:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-19 01:50:00 2014-08-19 02:00:00 2014-08-19 02:10:00 2014-08-19 02:20:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-19 02:30:00 2014-08-19 02:40:00 2014-08-19 02:50:00 2014-08-19 03:00:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-19 03:10:00 2014-08-19 03:20:00 2014-08-19 03:30:00 2014-08-19 03:40:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-19 03:50:00 2014-08-19 04:00:00 2014-08-19 04:10:00 2014-08-19 04:20:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-19 04:30:00 2014-08-19 04:40:00 2014-08-19 04:50:00 2014-08-19 05:00:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-19 05:10:00 2014-08-19 05:20:00 2014-08-19 05:30:00 2014-08-19 05:40:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-19 05:50:00 2014-08-19 06:00:00 2014-08-19 06:10:00 2014-08-19 06:20:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-19 06:30:00 2014-08-19 06:40:00 2014-08-19 06:50:00 2014-08-19 07:00:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-19 07:10:00 2014-08-19 07:20:00 2014-08-19 07:30:00 2014-08-19 07:40:00 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 2014-08-19 07:50:00 1 0 2 0 3 0 4 0 5 0 6 0
,當我這樣做,我收到以下錯誤:'錯誤:值列「nrow」不input.' –
見更新存在剛纔下一行。 –
優秀,謝謝 –