I am running the R code below, shown here on a small sample dataset: a cumulative sum of the previous values, excluding the first value.
library(plyr)
Ex<-structure(list(X1 = c(-36.8598, -37.1726, -36.4343, -36.8644,
-37.0599, -34.8818, -31.9907, -37.8304,
-34.3367, -31.2984, -33.5731),
X2 = c(64.26, 63.085, 66.36, 61.08, 61.57, 65.04, 72.69, 63.83,
67.555, 76.06, 68.61),
Y1 = c(493.81544, 493.81544, 494.54173,
494.61364, 494.61381, 494.38717, 494.64122, 493.73265, 494.04246,
494.92989, 494.98384),
Y2 = c(489.704166, 489.704166, 490.710962,
490.653212, 490.710612, 489.822928,
488.160904, 489.747776, 490.600579,
488.946738, 490.398958),
Y3 = c(19L, 19L, 19L, 23L, 30L,43L,43L,2L, 58L, 47L, 61L),
date = c("2013-06-01","2013-06-02","2013-06-03","2013-06-04",
"2013-06-05","2013-06-06","2013-06-07","2013-06-08",
"2013-06-09","2013-06-10","2013-06-11")),
.Names = c("X1", "X2", "Y1", "Y2", "Y3", "date"),
row.names = c(NA, 11L), class = "data.frame")
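Ex <- arrange(Ex, Y3)
# Flag repeated Y3 values; columns are referenced via Ex$ instead of attach()
Ex$Dup <- as.numeric(duplicated(Ex$Y3))
Ex$Dup_rev <- as.numeric(duplicated(Ex$Y3, fromLast = TRUE))
## Testing If Else
Ex$X5 <- 0
for (i in seq_len(nrow(Ex))) {
  if (Ex$Dup[i] == 0) {
    # First occurrence of this Y3 value (whatever Dup_rev is): start from Y2
    Ex$X5[i] <- Ex$Y2[i]
  } else {
    # Repeated Y3 value: add Y2 to the running total from the previous row.
    # Index Ex$X5 directly so the value written on the previous iteration is read back.
    Ex$X5[i] <- Ex$Y2[i] + Ex$X5[i - 1]
  }
}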
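What this does: for each row, it creates a column X5 holding the running sum of the preceding Y2 values, except when the Y3 value appears for the first time in the dataset, in which case X5 is just Y2. Because my data is very large (about 110k rows), this code takes a long time to run. Is there a simpler way to do the same thing?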
        X1     X2       Y1       Y2 Y3       date Dup Dup_rev        X5
1 -37.8304 63.830 493.7326 489.7478  2 2013-06-08   0       0  489.7478
2 -36.8598 64.260 493.8154 489.7042 19 2013-06-01   0       1  489.7042
3 -37.1726 63.085 493.8154 489.7042 19 2013-06-02   1       1 1469.1125
4 -36.4343 66.360 494.5417 490.7110 19 2013-06-03   1       0 1470.1193
5 -36.8644 61.080 494.6136 490.6532 23 2013-06-04   0       0  490.6532
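One reading of the description above is a cumulative sum of Y2 that restarts whenever a new Y3 value begins. Under that assumption, and assuming the rows are already sorted by Y3 (as `arrange(Ex, Y3)` does), the loop could be sketched without iteration using base R's `ave()` and `cumsum()`. The data frame `Ex_small` is a hypothetical five-row excerpt built from the Y2 and Y3 values printed above; it is not part of the original post.

```r
# Hypothetical excerpt: Y2 and Y3 from the five sorted rows printed above
Ex_small <- data.frame(
  Y2 = c(489.7478, 489.7042, 489.7042, 490.7110, 490.6532),
  Y3 = c(2L, 19L, 19L, 19L, 23L)
)

# ave() applies cumsum() within each Y3 group and returns the
# results in the original row order
Ex_small$X5 <- ave(Ex_small$Y2, Ex_small$Y3, FUN = cumsum)

print(Ex_small)
```

Because `ave()` groups by value rather than by adjacency, it only matches the row-by-row loop when equal Y3 values sit next to each other. For the third row this sketch gives 489.7042 + 489.7042 = 979.4084, so it does not reproduce the X5 column printed above.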
Can you post your desired output? The output I get from running your code does not match the description of what you are looking for.
My mistake. For example, given a = c(1,2,3,4,5), I want to create b such that b[i] = a[i] + b[i-1], where b[1] = 0. – RHelp
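The recurrence in this comment can be worked through directly: with b[1] = 0, each later b[i] is the sum of a[2] through a[i]. A minimal sketch, assuming that reading, computes it with `cumsum()` and checks it against an explicit loop. The name `b_loop` is introduced here only for the comparison.

```r
a <- c(1, 2, 3, 4, 5)

# b[1] = 0 and b[i] = a[i] + b[i - 1] for i > 1,
# i.e. a cumulative sum that skips the first element of a
b <- c(0, cumsum(a[-1]))

# Explicit loop over the same recurrence, for comparison only
b_loop <- numeric(length(a))
for (i in seq_along(a)[-1]) {
  b_loop[i] <- a[i] + b_loop[i - 1]
}

print(b)  # 0 2 5 9 14
```

The vectorized form avoids growing or indexing the result one element at a time, which is the part of the loop that tends to be slow on long vectors.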
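Can you use the variable names from your example? In the output you just posted, the value of 'X5' in the third row is '1469.1125'. From your explanation it should equal 'Y2[3] + X5[2]', which is '489.7042 + 489.7042 = 979.4084'. Sorry if I am missing something very obvious, but I cannot figure out where '1469.1125' comes from.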