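Counting new values that did not occur earlier, and values that did not occur in the previous group
I am counting the unique "new" users in each month. A new user is one who has not appeared before (since the start of the data). I am also counting the unique users who did not appear in the previous month.
The raw data looks like this: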
library(dplyr)
date <- c("2010-01-10","2010-02-13","2010-03-22","2010-01-11","2010-02-14","2010-03-23","2010-01-12","2010-02-14","2010-03-24")
mth <- rep(c("2010-01","2010-02","2010-03"),3)
user <- c("123","129","145","123","129","180","180","184","145")
dt <- data.frame(date,mth,user)
dt <- dt %>% arrange(date)
dt
        date     mth user
1 2010-01-10 2010-01  123
2 2010-01-11 2010-01  123
3 2010-01-12 2010-01  180
4 2010-02-13 2010-02  129
5 2010-02-14 2010-02  129
6 2010-02-14 2010-02  184
7 2010-03-22 2010-03  145
8 2010-03-23 2010-03  180
9 2010-03-24 2010-03  145
The answer should look like this:
new <- c(2,2,2,2,2,2,1,1,1)
totNew <- c(2,2,2,4,4,4,5,5,5)
notLastMonth <- c(2,2,2,2,2,2,2,2,2)
tmp <- cbind(dt,new,totNew,notLastMonth)
tmp
        date     mth user new totNew notLastMonth
1 2010-01-10 2010-01  123   2      2            2
2 2010-01-11 2010-01  123   2      2            2
3 2010-01-12 2010-01  180   2      2            2
4 2010-02-13 2010-02  129   2      4            2
5 2010-02-14 2010-02  129   2      4            2
6 2010-02-14 2010-02  184   2      4            2
7 2010-03-22 2010-03  145   1      5            2
8 2010-03-23 2010-03  180   1      5            2
9 2010-03-24 2010-03  145   1      5            2
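Is there a reason you want the totals for new, totNew and notLastMonth in that "user" table? It seems strange to store them on the user records. Getting new customers is simple: group by user, then mutate a new column holding the first month in which each user appears. Then group by that new column and count the users. – Shape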
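The steps in the comment above could be sketched roughly as follows. This is only one possible reading, not a confirmed answer: the helper names (`months`, `first_seen`, `not_last`, `monthly`, `result`) are made up for illustration, and "previous month" is assumed to mean the previous month that actually appears in the data, so calendar gaps are not handled. It also assumes `mth` is a "YYYY-MM" string, so that `min()` picks the earliest month.

```r
library(dplyr)

# Sample data from the question (strings kept as character, not factor)
dt <- data.frame(
  date = c("2010-01-10", "2010-02-13", "2010-03-22", "2010-01-11", "2010-02-14",
           "2010-03-23", "2010-01-12", "2010-02-14", "2010-03-24"),
  mth  = rep(c("2010-01", "2010-02", "2010-03"), 3),
  user = c("123", "129", "145", "123", "129", "180", "180", "184", "145"),
  stringsAsFactors = FALSE
) %>% arrange(date)

# All months present in the data, in order
months <- dt %>% distinct(mth) %>% arrange(mth)

# First month in which each user appears
first_seen <- dt %>%
  group_by(user) %>%
  summarise(first_mth = min(mth))

# Users seen in a month but not in the month before it
# (assumption: "month before" = previous month present in the data)
not_last <- dt %>%
  distinct(mth, user) %>%
  mutate(idx = match(mth, months$mth)) %>%
  arrange(user, idx) %>%
  group_by(user) %>%
  mutate(absent_prev = is.na(lag(idx)) | idx - lag(idx) > 1) %>%
  ungroup() %>%
  filter(absent_prev) %>%
  count(mth, name = "notLastMonth")

# One row per month: new users, running total of new users, notLastMonth
monthly <- months %>%
  left_join(count(first_seen, first_mth, name = "new"),
            by = c("mth" = "first_mth")) %>%
  left_join(not_last, by = "mth") %>%
  mutate(new = coalesce(new, 0L),              # months with no new users
         notLastMonth = coalesce(notLastMonth, 0L),
         totNew = cumsum(new))

# Attach the monthly counts back to the row-level data, as in the question
result <- dt %>% left_join(monthly, by = "mth")
result
```

On the sample data this should reproduce the new, totNew and notLastMonth columns shown in the question. Keeping the `monthly` summary as its own table, and joining it back only when needed, avoids repeating per-month totals on every user row.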