Concatenate the current row with the next row in a new column
Suppose we have the following data frame:
df <- data.frame(id = c(rep(1,5), rep(2, 3), rep(3, 4), rep(4, 2)), brand = c("A", "B", "A", "D", "Closed", "B", "C", "D", "D", "A", "B", "Closed", "C", "Closed"))
> df
# id brand
#1 1 A
#2 1 B
#3 1 A
#4 1 D
#5 1 Closed
#6 2 B
#7 2 C
#8 2 D
#9 3 D
#10 3 A
#11 3 B
#12 3 Closed
#13 4 C
#14 4 Closed
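I would like to create a new variable that shows how the brand column changes from the current row to the row directly below it, but only within each id.
Create the new column: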
df$brand_chg <- ""
This loop does exactly what I want:
for (i in 1:nrow(df)) {
  j <- i + 1
  if (j > nrow(df)) next                    # prevent an error on the very last row
  if (df[i, 'id'] != df[j, 'id']) next      # skip when the id changes
  # populate the concatenation
  df[i, 'brand_chg'] <- paste(df[i, 'brand'], df[j, 'brand'], sep = "->")
}
#Results:
# id brand brand_chg
#1 1 A A->B
#2 1 B B->A
#3 1 A A->D
#4 1 D D->Closed
#5 1 Closed
#6 2 B B->C
#7 2 C C->D
#8 2 D
#9 3 D D->A
#10 3 A A->B
#11 3 B B->Closed
#12 3 Closed
#13 4 C C->Closed
#14 4 Closed
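However, on a table with 287K rows this loop takes at least 10 minutes to run. Does anyone know a faster way to do this concatenation?
Thank you, I appreciate your insight.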
Untested on 287K rows: 'with(df, ave(brand, id, FUN = function(x) c(paste(head(x, -1), tail(x, -1), sep = '->'), '')))' – rawr
I get an error with 'with()', but once I removed it the 'ave()' function gave me a correctly concatenated list. Thank you! I will have to study how it works. – gatch
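For anyone else studying how the 'ave()' one-liner works, here is a minimal sketch that unpacks it on the data from the question. The helper name pair_with_next and the columns brand_chg and brand_chg2 are only illustrative, and stringsAsFactors = FALSE is an assumption added so that brand is a character vector in every R version. The second approach is an alternative that was not suggested in the thread; it assumes, as the original loop does, that the rows of each id are contiguous.

```r
# Same data as in the question; stringsAsFactors = FALSE is an assumption
# so that brand is character rather than factor in older R versions.
df <- data.frame(
  id = c(rep(1, 5), rep(2, 3), rep(3, 4), rep(4, 2)),
  brand = c("A", "B", "A", "D", "Closed", "B", "C", "D",
            "D", "A", "B", "Closed", "C", "Closed"),
  stringsAsFactors = FALSE
)

# pair_with_next is a hypothetical helper name, not from the thread.
# head(x, -1) drops the last element and tail(x, -1) drops the first,
# so pasting them pairs each value with its successor. The trailing ""
# fills the last row of the group, which has no successor.
pair_with_next <- function(x) {
  c(paste(head(x, -1), tail(x, -1), sep = "->"), "")
}

# ave() splits brand by id, applies the helper to each group, and puts
# the results back in the original row order.
df$brand_chg <- ave(df$brand, df$id, FUN = pair_with_next)

# Alternative without grouping: shift both columns up by one row and
# blank out the rows whose next row belongs to a different id.
next_id <- c(df$id[-1], NA)
next_brand <- c(df$brand[-1], NA)
same_id <- !is.na(next_id) & df$id == next_id
df$brand_chg2 <- ifelse(same_id, paste(df$brand, next_brand, sep = "->"), "")

print(df)
```

Both columns should match the output of the loop in the question. Neither version indexes the data frame row by row, which is likely where most of the loop's time goes, though I have not timed either on 287K rows.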