I need to update a spreadsheet with 1000 rows that has a problem. Filtering a data frame on multiple conditions
I have two datasets:
DF
CompanyID1         TMC1
ABC company        QBT
BCD company        G W TMC
jb hi fi           QBT
ABC company        GW TMC
FB Company         AMEX
LL company         AMEX
j k                QBT
k. l company       TP oil
1 to 1 lts         TP oil
2 in 1 pty ltd.    AMEX
DF2
DRA   CompanyID2         TMC2      Status
11    2 in 1 pty ltd.    AMEX      sent
12    1 to 1 lts         TP oil    produce
13    BCD company        ACE       sent
14    k. l company       TP oil    sent
15    jb hi fi           QBT       produce
16    ABC company        QBT       sent
17    j k                QBT       sent
18    FB Company         AMEX      sent
19    facebook pty       QBT       sent
20    2 in 1 pty ltd.    AMEX      produce
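What I am trying to achieve: for each value in df2$CompanyID2, first look it up in df$CompanyID1. If there is a match, and its df$TMC1 also matches df2$TMC2, then check the status. If df2$Status=='sent', create a new column and return the df2$DRA value; if df2$Status=='produce', then df$new should contain 'delete'.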
Example:
'ABC company' from df2$CompanyID2 exists in df$CompanyID1. ABC company's df$TMC1 matches df2$TMC2, and df2$Status=='sent'. Therefore df$new <- 16.
I would greatly appreciate your help. It would save a lot of time that I could put to other productive use. Thanks.
dput(DF1)
structure(list(Company.ID1 = structure(c(3L, 4L, 7L, 3L, 5L,
9L, 6L, 8L, 1L, 2L), .Label = c("1 to 1 lts", "2 in 1 pty ltd.",
"ABC company", "BCD company", "FB Company", "j k ", "jb hi fi",
"k. l company", "LL company"), class = "factor"), TMC1 = structure(c(4L,
2L, 4L, 3L, 1L, 1L, 4L, 5L, 5L, 1L), .Label = c("AMEX", "G W TMC",
"GW TMC", "QBT", "TP oil"), class = "factor")), .Names = c("Company.ID1",
"TMC1"), class = "data.frame", row.names = c(NA, -10L))
dput(DF2)
structure(list(DRA = 11:20, Company.ID2 = structure(c(2L, 1L,
4L, 9L, 8L, 3L, 7L, 6L, 5L, 2L), .Label = c("1 to 1 lts", "2 in 1 pty ltd.",
"ABC company", "BCD company", "facebook pty", "FB Company", "j k ",
"jb hi fi", "k. l company"), class = "factor"), TMC2 = structure(c(2L,
4L, 1L, 4L, 3L, 3L, 3L, 2L, 3L, 2L), .Label = c("ACE", "AMEX",
"QBT", "TP oil"), class = "factor"), Status = structure(c(2L,
1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L), .Label = c("produce", "sent"
), class = "factor")), .Names = c("DRA", "Company.ID2", "TMC2",
"Status"), class = "data.frame", row.names = c(NA, -10L))
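# My attempt so far (assumes df1 and df2 hold the dput output above).
# It only compares row i of df1 with row i of df2.
df1$new <- NA_character_
for (i in seq_len(nrow(df1))) {
  # as.character() is needed because the factor columns have different level sets
  if (as.character(df1$Company.ID1[i]) == as.character(df2$Company.ID2[i]) &&
      as.character(df1$TMC1[i]) == as.character(df2$TMC2[i]) &&
      df2$Status[i] == 'sent') {
    df1$new[i] <- 'sent'
  } else {
    df1$new[i] <- 'delete'
  }
}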
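One thing to watch in a loop like this: the dput output stores the text columns as factors, and the two data frames do not share the same level sets, so comparing them directly with == stops with an error. The following is a minimal sketch of one way around that, using hypothetical two-row excerpts and a helper named factors_to_character that is only an illustration, not part of any package.

```r
# Hypothetical two-row excerpts; stringsAsFactors = TRUE mimics the dput output.
df1 <- data.frame(Company.ID1 = c("ABC company", "LL company"),
                  TMC1 = c("QBT", "AMEX"),
                  stringsAsFactors = TRUE)
df2 <- data.frame(DRA = 16:17,
                  Company.ID2 = c("ABC company", "facebook pty"),
                  TMC2 = c("QBT", "QBT"),
                  stringsAsFactors = TRUE)

# Comparing two factors whose level sets differ raises an error.
cmp <- try(df1$Company.ID1 == df2$Company.ID2, silent = TRUE)
failed <- inherits(cmp, "try-error")

# Illustrative helper: turn every factor column into a character column.
factors_to_character <- function(d) {
  is_fac <- vapply(d, is.factor, logical(1))
  d[is_fac] <- lapply(d[is_fac], as.character)
  d
}

df1c <- factors_to_character(df1)
df2c <- factors_to_character(df2)

# With character columns the element-wise comparison works.
same <- df1c$Company.ID1 == df2c$Company.ID2
```

After the conversion, numeric columns such as DRA are left untouched and only the factor columns change type.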
However, a company in df1$Company.ID1 may appear more than once under the same name in df2$Company.ID2, and the matching entries may be on different rows.
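To illustrate why row position matters here, the sketch below contrasts a same-row comparison with a position-independent lookup. The two vectors are small hypothetical excerpts of the company columns, chosen so that one name occurs twice.

```r
# Hypothetical excerpts of the two company columns (plain character vectors).
company1 <- c("ABC company", "jb hi fi", "2 in 1 pty ltd.")
company2 <- c("2 in 1 pty ltd.", "jb hi fi", "ABC company", "2 in 1 pty ltd.")

# Same-row comparison: only finds a company if it sits on the same row in both.
same_row <- company1 == company2[seq_along(company1)]

# Position-independent check: is each name present anywhere in company2?
anywhere <- company1 %in% company2

# All row positions in company2 for each name (a name may occur more than once).
hits <- lapply(company1, function(x) which(company2 == x))
```

Here same_row misses two of the three companies, while hits shows that "2 in 1 pty ltd." maps to two rows of the second vector, which is the case a row-by-row loop cannot handle.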
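My expected output would follow these steps:
- Match company X's name from df1$Company.ID1 against df2$Company.ID2
- If it matches, check whether company X's df1$TMC1 matches df2$TMC2
- If steps 1 and 2 are both true, check whether company X's status is df2$Status=='sent'
- If that is TRUE, create a new column df1$new, take the DRA number from df2$DRA, and store it for company X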
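The steps above can be sketched with base R's merge(), joining on company name and TMC together and then deriving the new column from Status. This is only one possible reading of the requirements. It assumes the columns are character rather than factor, that unmatched rows should get NA, and that a company matching several df2 rows should produce one output row per match. The question does not specify the last two points, and the helper column row is introduced here only to restore the original order.

```r
df1 <- data.frame(
  Company.ID1 = c("ABC company", "BCD company", "jb hi fi", "ABC company",
                  "FB Company", "LL company", "j k ", "k. l company",
                  "1 to 1 lts", "2 in 1 pty ltd."),
  TMC1 = c("QBT", "G W TMC", "QBT", "GW TMC", "AMEX", "AMEX", "QBT",
           "TP oil", "TP oil", "AMEX"),
  stringsAsFactors = FALSE)

df2 <- data.frame(
  DRA = 11:20,
  Company.ID2 = c("2 in 1 pty ltd.", "1 to 1 lts", "BCD company",
                  "k. l company", "jb hi fi", "ABC company", "j k ",
                  "FB Company", "facebook pty", "2 in 1 pty ltd."),
  TMC2 = c("AMEX", "TP oil", "ACE", "TP oil", "QBT", "QBT", "QBT",
           "AMEX", "QBT", "AMEX"),
  Status = c("sent", "produce", "sent", "sent", "produce", "sent",
             "sent", "sent", "sent", "produce"),
  stringsAsFactors = FALSE)

# Remember the original row order, because merge() sorts by the key columns.
df1$row <- seq_len(nrow(df1))

# Left join: steps 1 and 2 (company and TMC must both match) in one go.
m <- merge(df1, df2,
           by.x = c("Company.ID1", "TMC1"),
           by.y = c("Company.ID2", "TMC2"),
           all.x = TRUE)

# Step 3 and 4: DRA number when sent, "delete" otherwise; NA when nothing matched.
m$new <- ifelse(m$Status == "sent", as.character(m$DRA), "delete")

result <- m[order(m$row, m$DRA), c("Company.ID1", "TMC1", "new")]
rownames(result) <- NULL
```

With this data, "ABC company" with TMC "QBT" gets 16, as in the example. "2 in 1 pty ltd." appears twice in the result because df2 lists it once as sent and once as produce.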
Thanks
@pierre lafortune thank you – Chemjong