Merge two data frames by grepl in R

Suppose I have two data frames such that:

df1 <- data.frame(x=c("abc", "def", "ghi", NA), y=1:4)
df1
     x y
1  abc 1
2  def 2
3  ghi 3
4 <NA> 4
df2 <- data.frame(x=c("a", "i"), z=4:5)
df2
  x z
1 a 4
2 i 5

What I want is to merge df1 and df2 by grepl-matching df2's x within df1's x, so that the desired result would be:

df3
     x y  z
1  abc 1  4
2  def 2 NA
3  ghi 3  5
4 <NA> 4 NA

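The actual data frames are much larger, and doing this by hand seems to take several lines. I would like to know whether there is a simpler way.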

Source

2017-01-27 David Z

Here is a one-liner that left joins df1 to df2, matching wherever df2.x is found within df1.x:

library(sqldf) 

sqldf("select df1.*, df2.z from df1 left join df2 on instr(df1.x, df2.x)")

giving:

     x y  z
1  abc 1  4
2  def 2 NA
3  ghi 3  5
4 <NA> 4 NA
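For readers who would rather stay in base R, the same substring join can be sketched with grepl. This is only an illustration, not part of the answer above: the names `hit` and `first_hit` are made up here, and the sketch assumes that when several patterns match one row, the first one in df2 wins (the SQL join would instead return one row per match).

```r
df1 <- data.frame(x = c("abc", "def", "ghi", NA), y = 1:4,
                  stringsAsFactors = FALSE)
df2 <- data.frame(x = c("a", "i"), z = 4:5, stringsAsFactors = FALSE)

# hit[r, c] is TRUE when pattern df2$x[c] occurs literally in df1$x[r];
# grepl() returns FALSE for NA inputs, so the NA row never matches
hit <- vapply(df2$x, function(p) grepl(p, df1$x, fixed = TRUE),
              logical(nrow(df1)))

# index of the first matching pattern per df1 row, NA when none matches
first_hit <- apply(hit, 1, function(r) which(r)[1])

# indexing with NA yields NA, which gives the left-join behaviour
df3 <- df1
df3$z <- df2$z[first_hit]
df3
```

Because `hit` has one cell per pair of rows, this is comfortable for moderate sizes but may use a lot of memory when both data frames are very large.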

Source

2017-01-27 17:13:38

Here is a base R method that will work if every element of df2 has exactly one match among the elements of df1:

# initialize new variable with NAs
df1$z <- NA
# fill in matching indices with df2$z
df1$z[sapply(df2$x, function(i) grep(i, df1$x, fixed=TRUE))] <- df2$z

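sapply(df2$x, function(i) grep(i, df1$x, fixed=TRUE)) runs through each element of df2$x and finds the position of its match within df1$x. As long as every element has exactly one match, the output is a vector of row indices.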
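To make the indexing step concrete, here is a small sketch that only computes and inspects those positions. It assumes the two data frames from the question, built with character columns; the name `idx` is used here purely for illustration.

```r
df1 <- data.frame(x = c("abc", "def", "ghi", NA), y = 1:4,
                  stringsAsFactors = FALSE)
df2 <- data.frame(x = c("a", "i"), z = 4:5, stringsAsFactors = FALSE)

# one grep() call per pattern; fixed = TRUE treats the pattern as a
# literal string rather than a regular expression
idx <- sapply(df2$x, function(i) grep(i, df1$x, fixed = TRUE))

# "a" is found in row 1 ("abc") and "i" in row 3 ("ghi")
idx
```

If some pattern had no match, or more than one, the pieces would differ in length and sapply would return a list instead of a vector, which is why the one-liner needs the single-match assumption.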

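To make this robust to non-matches between the two, you can do the following. In the example below, "j" finds no match. The [1] at the end of the grep call forces the result to NA instead of the default integer(0).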

# get matching indices, with NA for non-matches
matches <- unlist(lapply(c("a", "j"), function(i) grep(i, df1$x, fixed=TRUE)[1])) 
matches 
[1] 1 NA

Now use this together with is.na to subset the vectors on both sides of the assignment.

df1$z[matches[!is.na(matches)]] <- df2$z[!is.na(matches)]
df1
     x y  z
1  abc 1  4
2  def 2 NA
3  ghi 3 NA
4 <NA> 4 NA
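The two steps can be put together in one self-contained run. This is a minimal sketch under stated assumptions: df2 is given a hypothetical extra row "j" (not in the question) so that one pattern has no match, and `keep` is an illustrative name. Like the code above, it uses only the first match of each pattern.

```r
df1 <- data.frame(x = c("abc", "def", "ghi", NA), y = 1:4,
                  stringsAsFactors = FALSE)
# hypothetical df2 with an extra pattern "j" that matches nothing
df2 <- data.frame(x = c("a", "j", "i"), z = 4:6, stringsAsFactors = FALSE)

# first matching row of df1 for each pattern; [1] turns an empty
# result (integer(0)) into NA so every pattern yields one value
matches <- vapply(df2$x, function(p) grep(p, df1$x, fixed = TRUE)[1],
                  integer(1), USE.NAMES = FALSE)
keep <- !is.na(matches)

# start from all NA, then fill only the rows that were matched
df1$z <- NA_integer_
df1$z[matches[keep]] <- df2$z[keep]
df1
```

Using vapply with integer(1) instead of sapply is a design choice of this sketch: it fails loudly if a call ever returns something other than a single integer.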

Source

2017-01-27 17:37:45 lmo
