How do I correctly set arrange and group_by in dplyr?
My data is structured as follows:
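library(dplyr)  # provides %>%, arrange, group_by, mutate and lag used below

Athletes <- c("Gus", "Hudson", "Bobby", "Tom")
set.seed(400)
# 4 athletes x 4 quarters x 100 samples = 1600 rows;
# the shorter vectors are recycled to the length of Name
RawData <- data.frame(Name = rep(Athletes, each = 400),
                      Quarter = as.numeric(rep(1:4, each = 100)),
                      Sample = as.numeric(rep(1:100, each = 1)),
                      X = runif(400, 26, 30),
                      Y = runif(400, 12, 16))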
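I wish to calculate the displacement between consecutive X and Y pairs, over each Sample within each Quarter, for every Athlete. To do this, I have set up the following code: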
DistanceOutput <- RawData %>%
  arrange(Name, Sample, Quarter) %>%
  group_by(Name, Quarter) %>%
  mutate(lagX = lag(X, order_by = Sample), lagY = lag(Y, order_by = Sample)) %>%
  rowwise() %>%
  mutate(Distance = dist(matrix(c(X, Y, lagX, lagY), nrow = 2, byrow = TRUE))) %>%
  select(-lagX, -lagY)
However, this returns a data.frame structured as follows:
> head(DistanceOutput, n=10)
Source: local data frame [10 x 6]
Name Quarter Sample X Y Distance
(fctr) (dbl) (dbl) (dbl) (dbl) (dbl)
1 Bobby 1 1 27.82656 13.85830 NA
2 Bobby 2 1 27.37298 15.67940 NA
3 Bobby 3 1 28.74274 12.25703 NA
4 Bobby 4 1 26.63564 13.07924 NA
5 Bobby 1 2 26.32446 12.64722 1.929508
6 Bobby 2 2 26.88957 14.52096 NA
7 Bobby 3 2 27.53932 15.57959 3.533781
8 Bobby 4 2 28.03031 12.70763 1.443328
9 Bobby 1 3 29.68239 13.82739 3.559287
10 Bobby 2 3 29.43869 12.60890 3.186531
Instead, I would prefer my data to be set out as follows:
> head(DistanceOutput, n=3)
Source: local data frame [10 x 6]
Name Quarter Sample X Y Distance
(fctr) (dbl) (dbl) (dbl) (dbl) (dbl)
1 Bobby 1 1 27.82656 13.85830 NA
2 Bobby 1 2 26.32446 12.64722 1.929508
3 Bobby 1 3 29.68239 13.82739 3.559287
How do I correctly set group_by and arrange within the dplyr statement so that the output reflects this layout?
Thank you.
Apologies, thank you for letting me know. – user2716568
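One possible approach, offered as a sketch rather than a definitive answer: the rows in the question's output come out interleaved because arrange sorts by Sample before Quarter. Sorting by Name, then Quarter, then Sample should give the layout shown in the desired output. The version below also makes two changes that are assumptions, not part of the original code: it replaces the rowwise dist call with the plain Euclidean distance formula, which should give the same values for a pair of 2D points, and it adds ungroup() at the end.

```r
library(dplyr)

# Same example data as in the question
Athletes <- c("Gus", "Hudson", "Bobby", "Tom")
set.seed(400)
RawData <- data.frame(
  Name    = rep(Athletes, each = 400),
  Quarter = as.numeric(rep(1:4, each = 100)),
  Sample  = as.numeric(rep(1:100, each = 1)),
  X       = runif(400, 26, 30),
  Y       = runif(400, 12, 16)
)

DistanceOutput <- RawData %>%
  # Sort so that samples run consecutively within each athlete and quarter
  arrange(Name, Quarter, Sample) %>%
  group_by(Name, Quarter) %>%
  mutate(
    # Euclidean distance from the previous sample in the same group;
    # the first sample of each group has no predecessor, so it stays NA
    Distance = sqrt((X - lag(X, order_by = Sample))^2 +
                    (Y - lag(Y, order_by = Sample))^2)
  ) %>%
  ungroup()

head(DistanceOutput, n = 3)
```

With this ordering, only the first sample of each Name and Quarter group should have an NA Distance. The group_by call controls which rows lag treats as neighbours, while arrange only controls the order in which rows are displayed.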