2016-08-01 136 views
1

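Matching two lists of unequal length

I want to match the values in two lists, but only where the variable names in the lists are the same. I would like the result to be a list the length of the longer list, filled with the total number of matches.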

jac <- structure(list(s1 = "a", s2 = c("b", "c", "d"), s3 = 5), 
       .Names = c("s1", "s2", "s3")) 

larger <- structure(list(s1 = structure(c(1L, 1L, 1L), .Label = "a", class = "factor"), 
      s2 = structure(c(2L, 1L, 3L), .Label = c("b", "c", "d"), class = "factor"), 
      s3 = c(1, 2, 7)), .Names = c("s1", "s2", "s3"), row.names = c(NA, -3L), class = "data.frame") 

I used mapply(FUN = pmatch, jac, larger), which gives me the correct totals, but not in the format I would like.

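However, I don't think pmatch guarantees that the names match in every case, so I wrote a function, and I am still having problems with it: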

prodMatch <- function(jac,larger){ 
     for(i in 1:nrow(larger)){ 
      if(names(jac)[i] %in% names(larger[i])){ 
       r[i] <- jac %in% larger[i] 
       r 
      } 
    } 
} 

Can anyone help?

Here is another dataset, one where the length of one object is not a multiple of the other:

larger2 <- 
    structure(list(s1 = structure(c(1L, 1L, 1L), class = "factor", .Label = "a"), 
     s2 = structure(c(1L, 1L, 1L), class = "factor", .Label = "c"), 
     s3 = c(1, 2, 7), s4 = c(8, 9, 10)), .Names = c("s1", "s2", 
    "s3", "s4"), row.names = c(NA, -3L), class = "data.frame") 

Answers

0

mapply returns a list of match indices, which you can convert to a data frame simply by using as.data.frame:

as.data.frame(mapply(match, jac, larger)) 
# s1 s2 s3 
# 1 1 2 NA 
# 2 1 1 NA 
# 3 1 3 NA 
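
To see where the three rows come from, it can help to look at the intermediate list before it is converted. This is a minimal sketch: jac and larger are rebuilt with list() and data.frame() so that the block runs on its own, and the name idx is only illustrative.

```r
# Same data as in the question, rebuilt in a shorter form
jac <- list(s1 = "a", s2 = c("b", "c", "d"), s3 = 5)
larger <- data.frame(s1 = factor(c("a", "a", "a")),
                     s2 = factor(c("c", "b", "d")),
                     s3 = c(1, 2, 7))

# mapply pairs the elements of jac with the columns of larger by position
idx <- mapply(match, jac, larger)

# The elements have different lengths (1, 3, 1), so idx stays a list
str(idx)

# as.data.frame recycles the length-1 elements to fill three rows
as.data.frame(idx)
```

Note that each value is the position in the larger column where the jac value was found, so the rows of the data frame follow the elements of jac rather than the rows of larger.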

Using cbind to attach this result to larger gives the output you expect:

cbind(larger, 
     setNames(as.data.frame(mapply(match, jac, larger)), 
       paste(names(jac), "result", sep = ""))) 

# s1 s2 s3 s1result s2result s3result 
#1 a c 1  1  2  NA 
#2 a b 2  1  1  NA 
#3 a d 7  1  3  NA 

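Update: To handle the case where the names of the two lists do not match, we can loop through larger, passing each column together with its name, and extract the corresponding element from jac as follows: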

as.data.frame(
    mapply(function(col, name) { 
     m <- match(jac[[name]], col) 
     if(length(m) == 0) NA else m # if the name doesn't exist in jac return NA as well 
     }, larger, names(larger))) 

# s1 s2 s3 
#1 1 2 NA 
#2 1 1 NA 
#3 1 3 NA 
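
As a rough check, the same name-based lookup can be tried on the second dataset from the question, which has a column (s4) that jac lacks. This is only a sketch: larger2 is rebuilt with data.frame() so the block is self-contained, and res is an illustrative name.

```r
jac <- list(s1 = "a", s2 = c("b", "c", "d"), s3 = 5)
larger2 <- data.frame(s1 = factor(c("a", "a", "a")),
                      s2 = factor(c("c", "c", "c")),
                      s3 = c(1, 2, 7),
                      s4 = c(8, 9, 10))

res <- mapply(function(col, name) {
  # jac[[name]] is NULL when jac has no element with that name
  m <- match(jac[[name]], col)
  if (length(m) == 0) NA else m
}, larger2, names(larger2))

# One element per column of larger2; s4 is NA because jac has no s4
str(res)
```

Because the function looks elements up by name, the extra column does not shift the pairing between jac and larger2.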
+1

I will be working with a lot of rows and would like to use data.table if possible. Does your suggestion work the same way with data.table? – user3067851

+0

You can use 'as.data.table' to convert the result to a 'data.table'. – Psidom

+1

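When using 'match', the matching indices will be found even if the column names don't match, correct? If I have matching values in columns with different names, that could be a problem, no? – user3067851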
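
One way to address the concern about names, shown here only as a sketch, is to restrict both objects to the names they share before calling mapply, so that elements are paired by name rather than by position. The objects jac2, common, and idx are hypothetical and exist only for this illustration.

```r
# jac2 lists its elements in a different order than the columns of larger
jac2 <- list(s3 = 7, s1 = "a")
larger <- data.frame(s1 = factor(c("a", "a", "a")),
                     s2 = factor(c("c", "b", "d")),
                     s3 = c(1, 2, 7))

# Keep only the names present in both objects, in the same order for each
common <- intersect(names(jac2), names(larger))

# Indexing both sides by common lines up s3 with s3 and s1 with s1
idx <- mapply(match, jac2[common], larger[common], SIMPLIFY = FALSE)
str(idx)
```

SIMPLIFY = FALSE keeps the result a list even when every match has length one.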