Select multiple rows by ID, subject to a condition given by another table

Let's say I have a data frame with monthly sales tickets, containing a client ID, the month, and the amount.

head(tickets) 
    id_client  month sales 
1 ID87160 2016-01-01 16875 
2 ID18694 2016-01-01 448 
3 ID20624 2016-01-01 16311 
4 ID171683 2016-01-01 314 
5 ID214926 2016-01-01 8889 
6 ID82071 2016-01-01 7479

I have another data.frame that records when each client cancelled their subscription.

head(stop_being_client) 
    id_client  date 
1 ID235005 2016-03-01 
2 ID50615 2016-04-01 
3 ID72078 2016-03-01 
4 ID129556 2016-01-01 
5 ID204060 2016-04-01 
6 ID57769 2016-01-01

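Now I need to check that the tickets table contains no rows for clients who no longer have a subscription, i.e. any row whose ticket month is later than that client's stop_being_client date.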

In PostgreSQL it would be easy:

SELECT 
    * 
FROM 
    tickets 
JOIN 
    stop_being_client 
ON 
    tickets.id_client = stop_being_client.id_client 
WHERE 
    tickets.month > stop_being_client.date;

But I don't know how to do it in R. I tried this:

tickets[which(
    tickets$id_client %in% stop_being_client$id_client & 
    tickets$month > stop_being_client$date 
    ),]

But I'm sure the result is not what I want, because I need to match id_client between the two tables when comparing the dates.

Edit: here is an example.

This is the tickets data.frame:

id_client  month sales 
     ID2 2016-01-01 12698 
     ID1 2016-01-01 8626 
     ID2 2016-02-01 18309 
     ID1 2016-02-01 15653 
     ID3 2016-02-01 9642 
     ID3 2016-03-01 18376 
     ID1 2016-03-01 13440 
     ID2 2016-03-01 2322 
     ID1 2016-04-01 19010 
     ID3 2016-04-01 7129 
     ID2 2016-04-01 14694 
     ID2 2016-05-01 4726 
     ID1 2016-05-01 706 
     ID3 2016-05-01 16995 
     ID1 2016-06-01 18743 
     ID3 2016-06-01 16725 
     ID2 2016-07-01 2632

This is the stop_being_client table:

id_client  date 
     ID1 2016-03-01 
     ID2 2016-04-01

So I want to detect the rows in tickets that should not exist:

id_client  month sales 
     ID1 2016-04-01 19010 
     ID2 2016-05-01 4726 
     ID1 2016-05-01 706 
     ID1 2016-06-01 18743 
     ID2 2016-07-01 2632

Source

2017-04-25 Andreu Jiménez

Can you make a [reproducible example](http://stackoverflow.com/questions/5963269/how) and show the expected output?

Here is an idea via base R:

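# `df` is the tickets data.frame.
# `ind1` is not defined in the original answer; the line below is an assumption:
# one stop date per client, in the order of split(), NA for clients without one.
l4 <- split(df, df$id_client)
ind1 <- stop_being_client$date[match(names(l4), stop_being_client$id_client)]
do.call(rbind, lapply(Map(cbind, l4, temp = ind1), function(i) {
  # keep only rows dated after the client's stop date (none if there is no stop date)
  i <- i[i$month > i$temp[!is.na(i$temp)], ]
  i$temp <- NULL
  i
}))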


#  id_client  month sales 
#ID1.9  ID1 2016-04-01 19010 
#ID1.13  ID1 2016-05-01 706 
#ID1.15  ID1 2016-06-01 18743 
#ID2.12  ID2 2016-05-01 4726 
#ID2.17  ID2 2016-07-01 2632

Source

2017-04-25 09:57:48 Sotos

I guess you could use a non-equi join, right? – akrun

@akrun Yes, that works too. I didn't think of it – Sotos
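
The join-then-filter logic of the SQL query in the question can also be mirrored in base R with `merge()`. The following is a minimal sketch, not taken from any of the answers; it assumes both date columns are stored as `Date` (or as ISO-formatted strings) and rebuilds the example tables from the question so that it runs on its own. The names `joined` and `invalid` are illustrative.

```r
# Rebuild the example tables from the question (dates parsed as Date).
tickets <- data.frame(
  id_client = c("ID2", "ID1", "ID2", "ID1", "ID3", "ID3", "ID1", "ID2", "ID1",
                "ID3", "ID2", "ID2", "ID1", "ID3", "ID1", "ID3", "ID2"),
  month = as.Date(c("2016-01-01", "2016-01-01", "2016-02-01", "2016-02-01",
                    "2016-02-01", "2016-03-01", "2016-03-01", "2016-03-01",
                    "2016-04-01", "2016-04-01", "2016-04-01", "2016-05-01",
                    "2016-05-01", "2016-05-01", "2016-06-01", "2016-06-01",
                    "2016-07-01")),
  sales = c(12698, 8626, 18309, 15653, 9642, 18376, 13440, 2322, 19010,
            7129, 14694, 4726, 706, 16995, 18743, 16725, 2632),
  stringsAsFactors = FALSE
)
stop_being_client <- data.frame(
  id_client = c("ID1", "ID2"),
  date = as.Date(c("2016-03-01", "2016-04-01")),
  stringsAsFactors = FALSE
)

# Inner join on id_client (like the SQL JOIN ... ON), then keep the rows
# whose month is after that client's stop date (like the SQL WHERE).
joined <- merge(tickets, stop_being_client, by = "id_client")
invalid <- joined[joined$month > joined$date, c("id_client", "month", "sales")]

# merge() reorders rows, so sort explicitly for a stable display.
invalid <- invalid[order(invalid$month, invalid$id_client), ]
rownames(invalid) <- NULL
print(invalid)
```

Because `merge()` performs an inner join by default, clients without a stop date (ID3 here) drop out before the date comparison, which is the row-by-row matching the `%in%` attempt in the question was missing.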

With data.table:

library(data.table) 
setDT(tickets) 
setDT(stop_being_client) 

stop_being_client[tickets, on = .(date < month, id_client==id_client),nomatch=0,.(id_client,month,date,sales)] 

id_client  month  date sales 
1:  ID1 2016-04-01 2016-04-01 19010 
2:  ID2 2016-05-01 2016-05-01 4726 
3:  ID1 2016-05-01 2016-05-01 706 
4:  ID1 2016-06-01 2016-06-01 18743 
5:  ID2 2016-07-01 2016-07-01 2632

Source

2017-04-25 10:14:18
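
In the data.table output above, the `date` column repeats the `month` values rather than showing each client's stop date. As far as I understand data.table's non-equi joins, the join column from `x` is reported with the values from `i`, and the original `x` value can be requested with the `x.` prefix. The sketch below illustrates this on a subset of the question's rows; the column name `stop_date` and the variable `res` are illustrative, not part of the original answer.

```r
library(data.table)

# A subset of the question's example rows, with dates stored as Date.
tickets <- data.table(
  id_client = c("ID1", "ID1", "ID2", "ID2", "ID3"),
  month = as.Date(c("2016-03-01", "2016-04-01", "2016-04-01",
                    "2016-05-01", "2016-05-01")),
  sales = c(13440, 19010, 14694, 4726, 16995)
)
stop_being_client <- data.table(
  id_client = c("ID1", "ID2"),
  date = as.Date(c("2016-03-01", "2016-04-01"))
)

# Non-equi join: match on id_client and keep tickets dated after the stop date.
# x.date asks for the stop date as stored in stop_being_client.
res <- stop_being_client[
  tickets,
  on = .(id_client, date < month),
  nomatch = 0,
  .(id_client, month, stop_date = x.date, sales)
]
print(res)
```

With this variant each returned row carries both the ticket month and the stop date it was compared against, which makes the result easier to audit.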
