2016-02-12 46 views
1

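I have the following data set, which I would like to sum by end.station.id and reshape into a matrix of 317 rows (one per station) and 72 columns (one per interval).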

 
> sapply(df, class) 
$end.station.id 
[1] "integer" 

$stoptime 
[1] "POSIXct" "POSIXt" 

$interval 
[1] "POSIXct" "POSIXt" 

> dim(df) 
[1] 8256 3 
> length(unique(df$end.station.id)) 
[1] 317 
> length(unique(df$interval)) 
[1] 72 
> head(df) 
     end.station.id   stoptime   interval 
14785   437 2014-08-18 21:08:36 2014-08-18 21:00:00 
16980   406 2014-08-18 20:34:22 2014-08-18 20:30:00 
20200   372 2014-08-18 22:53:33 2014-08-18 22:50:00 
20935   2000 2014-08-18 22:43:18 2014-08-18 22:40:00 
22610   499 2014-08-18 20:51:28 2014-08-18 20:50:00 
22678   401 2014-08-18 20:05:54 2014-08-18 20:00:00 

I have been unable to build this matrix using dplyr:
 
library(dplyr); 
library(tidyr); 
> matrix <- df %>% 
+ group_by(end.station.id, interval)%>% 
+ summarise(sum = nrow) %>% 
+ spread(end.station.id, nrow) 
Error: not a vector 

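I have also thought about assigning a unique integer to each interval, but because the column is in POSIXct format, the data is lost when I try to extract the interval column and sort it with order(x, decreasing = FALSE).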
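One possible reason the values seemed to disappear: order() returns the positions that would sort a vector, not the sorted values themselves. A minimal sketch of one way to get an integer index per interval is below. The sample vector and the names levels_sorted and interval_id are made up for illustration and are not from the original data.

```r
# Hypothetical sample of interval values (POSIXct), not the question's data
interval <- as.POSIXct(
  c("2014-08-18 21:00:00", "2014-08-18 20:30:00",
    "2014-08-18 22:50:00", "2014-08-18 20:30:00"),
  tz = "UTC"
)

# Distinct intervals in chronological order; both unique() and sort()
# keep the POSIXct class, so no date information is dropped
levels_sorted <- sort(unique(interval))

# Position of each value among the sorted distinct intervals
interval_id <- match(interval, levels_sorted)
```

Here interval_id holds one integer per row of the input, and levels_sorted[interval_id] recovers the original timestamps.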

In the end, the result should resemble the following matrix, though populated with the total for each station in each interval.

 
> head(m) 
    station_id 2014-08-18 20:00:00 2014-08-18 20:10:00 2014-08-18 20:20:00 
1  302     0     0     0 
2  487     0     0     0 
3  218     0     0     0 
4  465     0     0     0 
5  160     0     0     0 
6  291     0     0     0 
    2014-08-18 20:30:00 2014-08-18 20:40:00 2014-08-18 20:50:00 2014-08-18 21:00:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-18 21:10:00 2014-08-18 21:20:00 2014-08-18 21:30:00 2014-08-18 21:40:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-18 21:50:00 2014-08-18 22:00:00 2014-08-18 22:10:00 2014-08-18 22:20:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-18 22:30:00 2014-08-18 22:40:00 2014-08-18 22:50:00 2014-08-18 23:00:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-18 23:10:00 2014-08-18 23:20:00 2014-08-18 23:30:00 2014-08-18 23:40:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-18 23:50:00 2014-08-19 00:00:00 2014-08-19 00:10:00 2014-08-19 00:20:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-19 00:30:00 2014-08-19 00:40:00 2014-08-19 00:50:00 2014-08-19 01:00:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-19 01:10:00 2014-08-19 01:20:00 2014-08-19 01:30:00 2014-08-19 01:40:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-19 01:50:00 2014-08-19 02:00:00 2014-08-19 02:10:00 2014-08-19 02:20:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-19 02:30:00 2014-08-19 02:40:00 2014-08-19 02:50:00 2014-08-19 03:00:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-19 03:10:00 2014-08-19 03:20:00 2014-08-19 03:30:00 2014-08-19 03:40:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-19 03:50:00 2014-08-19 04:00:00 2014-08-19 04:10:00 2014-08-19 04:20:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-19 04:30:00 2014-08-19 04:40:00 2014-08-19 04:50:00 2014-08-19 05:00:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-19 05:10:00 2014-08-19 05:20:00 2014-08-19 05:30:00 2014-08-19 05:40:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-19 05:50:00 2014-08-19 06:00:00 2014-08-19 06:10:00 2014-08-19 06:20:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-19 06:30:00 2014-08-19 06:40:00 2014-08-19 06:50:00 2014-08-19 07:00:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-19 07:10:00 2014-08-19 07:20:00 2014-08-19 07:30:00 2014-08-19 07:40:00 
1     0     0     0     0 
2     0     0     0     0 
3     0     0     0     0 
4     0     0     0     0 
5     0     0     0     0 
6     0     0     0     0 
    2014-08-19 07:50:00 
1     0 
2     0 
3     0 
4     0 
5     0 
6     0 

Answer

1

Change the line summarize(sum = nrow) to summarize(sum = n()), and the line spread(end.station.id, nrow) to spread(end.station.id, sum).

Finally, if you want the intervals along the top, transpose the result with t().
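Put together, the two changes and the transpose might look like the sketch below. It runs on a small made-up data frame that reuses the question's column names. The fill = 0L argument, the ungroup() call, and the formatted column labels are additions for illustration rather than part of the answer as posted.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for df (hypothetical values, same column names as the question)
df <- data.frame(
  end.station.id = c(437L, 406L, 437L, 372L, 406L, 437L),
  interval = as.POSIXct(
    c("2014-08-18 21:00:00", "2014-08-18 20:30:00", "2014-08-18 21:00:00",
      "2014-08-18 22:50:00", "2014-08-18 21:00:00", "2014-08-18 20:30:00"),
    tz = "UTC"
  )
)

counts <- df %>%
  group_by(end.station.id, interval) %>%
  summarise(sum = n()) %>%                 # n() counts the rows in each group
  ungroup() %>%
  spread(end.station.id, sum, fill = 0L)   # one column per station; absent pairs become 0

# Drop the interval column, transpose so stations become rows,
# and use the interval timestamps as column labels
m <- t(as.matrix(select(counts, -interval)))
colnames(m) <- format(counts$interval, "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

Without the fill argument, station/interval pairs that never occur in the data would come out as NA instead of 0.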

+1

When I do this, I receive the following error: 'Error: Value column 'nrow' does not exist in input.'

+0

See the update I just made to the next line.

+0

Excellent, thank you.