R: compare two data frames on two columns and increment a third

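I have the following problem. In one data frame I have an observation of each customer on each day. In another I have the purchases they made. I am interested in how many items each customer has bought so far on any given day. I solved this with a for loop, but I would like to know whether there is a more efficient way.

Let's look at an example: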

# Same customer observed on 10 different occasions 
customers<-data.frame(id=rep(1:10, 10), date=rep(11:20, each=10)) 
purchases<-data.frame(id=c(1,1,4,6,6,6), date=c(12, 14, 12, 9, 13, 17)) 

# I can achieve what I want if I add a cumulative sum column and run a for loop 
purchases$count<-sapply(1:length(purchases$id), function(i) sum(purchases$id[i]==purchases$id[1:i])) 

customers$count<-0 
for(i in 1:nrow(purchases)){ 
    customers[(customers$id==purchases[i, "id"] & customers$date>=purchases[i, "date"]),"count"]<-purchases[i,"count"] 
} 

customers
     id date count
1     1   11     0
2     2   11     0
3     3   11     0
4     4   11     0
5     5   11     0
6     6   11     1
7     7   11     0
8     8   11     0
9     9   11     0
10   10   11     0
11    1   12     1
12    2   12     0
13    3   12     0
14    4   12     1
...
...
100  10   20     0

Is there a faster way to do this?

Thanks in advance.

Source

2015-04-02 PoorLifeChoicesMadeMeWhoIAm

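Maybe I framed the question wrongly; I am interested in the cumulative count. The count observed on the 11th comes from customer 6's purchase on the 9th. – PoorLifeChoicesMadeMeWhoIAm 2015-07-05 06:31:58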

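I edited my answer to do a cumulative count, but your example still looks incorrect. If customer 1 purchased on the 11th, the count should equal 1 on the first row of the desired output. – C8H10N4O2 2015-07-06 13:34:10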

I think I made a mistake while editing the code; I believe it used to be 12 in the original version. – PoorLifeChoicesMadeMeWhoIAm 2015-07-06 21:35:56

Here is a base R solution, although packages such as dplyr and data.table are also useful for this:

# make sure purchases table is ordered correctly to do cumulative count 
cum_purchases <- cbind(purchases <- purchases[order(purchases$id, purchases$date),], 
         count = with(purchases, ave(id==id, id, FUN=cumsum))) 
cum_purchases 
#   id date count
# 1  1   11     1
# 2  1   14     2
# 3  4   12     1
# 4  6    9     1
# 5  6   13     2
# 6  6   17     3
out <- merge(customers, cum_purchases, all.x = TRUE) # "left join"
out 
# note that this solution produces NA instead of 0 for no purchases 
# you can change this with: 
out$count[is.na(out$count)] <- 0 
out[order(out$date,out$id),] # produces a correct version of the example output

R gives you lots of ways to count things. (Edited to use a cumulative count.)
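As one more way to build the running count per customer, the sketch below numbers each customer's purchases with `ave()` and `seq_along()` instead of the `id == id` trick. It is only an illustration: it rebuilds the `purchases` data frame from the question so that it runs on its own, and it covers the cumulative-count step only, not the join back to `customers`.

```r
# Purchases data from the question, rebuilt here so the sketch is self-contained.
purchases <- data.frame(id = c(1, 1, 4, 6, 6, 6),
                        date = c(12, 14, 12, 9, 13, 17))

# Sort by customer, then by date, so the running count follows purchase order.
purchases <- purchases[order(purchases$id, purchases$date), ]

# Number each customer's purchases 1, 2, 3, ... within its id group.
purchases$count <- ave(seq_along(purchases$id), purchases$id, FUN = seq_along)

purchases
```

Sorting in a separate statement, rather than assigning inside the `cbind()` call, keeps the result independent of argument evaluation order.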

Source

2015-07-04 00:00:42 C8H10N4O2

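The merge doesn't work, because any date that does not appear in purchases ends up with a count of 0. For example, on day 20 customer 1 would have 0 purchases instead of 2, and customer 6 would have 0 instead of 3. I tested the example code and made sure it works. The output should be identical to the customers data.frame at the end of the example. – PoorLifeChoicesMadeMeWhoIAm 2015-07-06 21:50:27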
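One loop-free way to get the count described in this comment (purchases made on or before each observation date) is to compare every observation with every purchase using `outer()`. This is a minimal sketch in base R, not the answer's method. It rebuilds the example data from the question and assumes the observations-by-purchases matrix fits in memory (100 x 6 here).

```r
# Example data from the question, rebuilt so the sketch runs on its own.
customers <- data.frame(id = rep(1:10, 10), date = rep(11:20, each = 10))
purchases <- data.frame(id = c(1, 1, 4, 6, 6, 6),
                        date = c(12, 14, 12, 9, 13, 17))

# Logical matrices with one row per observation and one column per purchase.
same_id      <- outer(customers$id, purchases$id, "==")
on_or_before <- outer(customers$date, purchases$date, ">=")

# Count, for each observation, the purchases by the same customer made
# on or before the observation date.
customers$count <- rowSums(same_id & on_or_before)

customers[customers$date == 20 & customers$id %in% c(1, 6), ]
```

For large inputs the intermediate matrices grow with the product of the two row counts, so a non-equi or rolling join (for example in data.table) may scale better. That alternative is not shown here.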
