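Combining dplyr joins and set operations into a custom function
I have two simple data frames, shown below. Using dplyr and the tidyverse, I want to find the categories in "Task2" of the second data frame (Df2) that do not appear in "Task" of the first data frame (Df). I would like to use dplyr's "setdiff" function for this. I also want to keep the corresponding times from the "Time" column of the second data frame (Df2).
So the final result should contain two rows: one for the client "Chris" with "Iron shirt" and a total time of 30, and one for the client "Eric" with "Buy groceries" and a corresponding time of 8.
I would also like to drop the Date column.
One way I can think of doing this is to use dplyr's "setdiff" function to isolate the two rows (I realize the Task and Task2 columns must be renamed so that they match), and then use a join to bring the total times back in.
Finally, I would like this to be a custom function, because I will have to repeat this task many times. I want something like "Difference(Df1, Df2)", so that I can pass in the two data frames and get the result back.
I hope this is not asking too much! I am new to custom functions, especially ones that involve dplyr and pipes.
I hope someone can help!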
CaseWorker<-c("John","John","Kim")
Client<-c("Chris","Chris","Eric")
Task<-c("Feed cat","Make dinner","Do homework")
Date<-c("10/27/2016","09/22/2016","10/11/2016")
Df<-data.frame(CaseWorker,Client,Date,Task)
Second data frame...
CaseWorker<-c("John","John","John","John","John","John","John","John","John",
"John","Kim","Kim","Kim")
Client<-c("Chris","Chris","Chris","Chris","Chris","Chris","Chris","Chris","Chris","Chris","Eric","Eric","Eric")
Date<-c("11/10/2016","10/10/2016","11/13/2016","09/18/2016","11/11/2016","09/19/2016","08/08/2016","10/10/2016","08/05/2016","11/12/2016","09/09/2016","11/11/2016","09/10/2016")
Task2<-c("Feed cat","Feed cat","Feed cat","Feed cat","Feed cat","Make dinner","Make dinner","Make dinner","Iron shirt","Iron shirt","Do homework",
"Do homework","Buy groceries")
Time<-c(20,34,11,10,5,6,55,30,20,10,12,10,8)
Df2<-data.frame(CaseWorker,Client,Date,Task2,Time)
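The approach described above (setdiff on the matching columns, then a join back to recover the times) could be wrapped in a function along the following lines. This is only a minimal sketch, not a definitive implementation. It assumes dplyr 1.0.0 or later (for the `.groups` argument), assumes the key columns are character rather than factor, and hard-codes the column names from the question. The name `DiffGo` is just a placeholder, and the data is rebuilt compactly with `rep()` so the block runs on its own.

```r
library(dplyr)

# Data from the question, rebuilt compactly; stringsAsFactors = FALSE keeps
# the key columns as character so setdiff and the join compare plain strings.
Df <- data.frame(
  CaseWorker = c("John", "John", "Kim"),
  Client     = c("Chris", "Chris", "Eric"),
  Date       = c("10/27/2016", "09/22/2016", "10/11/2016"),
  Task       = c("Feed cat", "Make dinner", "Do homework"),
  stringsAsFactors = FALSE
)
Df2 <- data.frame(
  CaseWorker = rep(c("John", "Kim"), times = c(10, 3)),
  Client     = rep(c("Chris", "Eric"), times = c(10, 3)),
  Date       = c("11/10/2016", "10/10/2016", "11/13/2016", "09/18/2016",
                 "11/11/2016", "09/19/2016", "08/08/2016", "10/10/2016",
                 "08/05/2016", "11/12/2016", "09/09/2016", "11/11/2016",
                 "09/10/2016"),
  Task2      = rep(c("Feed cat", "Make dinner", "Iron shirt",
                     "Do homework", "Buy groceries"),
                   times = c(5, 3, 2, 2, 1)),
  Time       = c(20, 34, 11, 10, 5, 6, 55, 30, 20, 10, 12, 10, 8),
  stringsAsFactors = FALSE
)

# Hypothetical helper (the name is a placeholder): returns the tasks in df2
# that are absent from df1, with Time summed per CaseWorker/Client/Task.
DiffGo <- function(df1, df2) {
  keys <- c("CaseWorker", "Client", "Task")

  # Drop the date and rename Task2 so the key columns match df1.
  df2 <- df2 %>%
    select(-Date) %>%
    rename(Task = Task2)

  # Distinct key combinations that occur in df2 but not in df1.
  new_keys <- setdiff(df2[keys], df1[keys])

  # Join back to df2 to recover the times, then total them per combination.
  df2 %>%
    inner_join(new_keys, by = keys) %>%
    group_by(CaseWorker, Client, Task) %>%
    summarise(Time = sum(Time), .groups = "drop")
}

# Two rows: Chris / Iron shirt / 30 and Eric / Buy groceries / 8.
DiffGo(Df, Df2)
```

If the setdiff step is not a requirement, `anti_join(df2, df1, by = keys)` should perform the same filtering in a single call, which makes the separate join back to `df2` unnecessary.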
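Thanks! The solution was much simpler than I imagined, and it works well. Since it is only three lines, I don't think a custom function is necessary, but out of curiosity I would still like to know how to write one. Perhaps called "DiffGo(Df1, Df2)"? Is that an easy thing to do? – Mike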