2017-03-06

How do I add new rows to a data frame?

I have two data frames that look like this:

DF1:

 V1  V2  V3  V4 
rs200140498 chr1 861315 GG 
rs371217242 chr1 861329 AA 
rs200686669 chr1 861349 CC 
rs370046315 chr1 861357 CC 
rs374110379 chr1 861521 GG 
rs74045401 chr1 861530 GG 
rs377418023 chr1 865394 CC 
rs79027658 chr1 865438 CC 
rs202189913 chr1 865488 AA 
rs370992396 chr1 865543 GG 

and DF2:

 V1  V2  V3  V4 
rs200140498 chr1 861315 GG 
rs200686669 chr1 861349 CC 
rs370046315 chr1 861357 CC 
rs74045401 chr1 861530 GG 
rs377418023 chr1 865394 CC 
rs202189913 chr1 865488 AA 
rs370992396 chr1 865543 GG 

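I want to compare them and get a new data frame that keeps every row of DF1 but shows -- in V4 wherever the row is missing from DF2: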

  V1  V2  V3  V4 
rs200140498 chr1 861315 GG 
rs371217242 chr1 861329 -- 
rs200686669 chr1 861349 CC 
rs370046315 chr1 861357 CC 
rs374110379 chr1 861521 -- 
rs74045401  chr1 861530 GG 
rs377418023 chr1 865394 CC 
rs79027658  chr1 865438 -- 
rs202189913 chr1 865488 AA 
rs370992396 chr1 865543 GG 

Can anyone help me?

Answers

Try this:

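library(dplyr)  # install it once with install.packages("dplyr")

# Keep every row of df1; df2's V4 arrives as V4.y (NA where df2 has no matching row)
df3 <- left_join(df1, df2, by = c("V1", "V2", "V3"))
df3 <- df3[, -4]  # drop column 4, which is V4.x (the V4 that came from df1)
View(df3)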
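
To see why column 4 is the one to drop, here is a minimal sketch on a two-row toy version of the question's data. The toy data frames are my own construction for illustration; only `left_join` and the column-dropping step come from the answer above.

```r
library(dplyr)

# Two-row toy version of DF1 (values copied from the question)
df1 <- data.frame(V1 = c("rs200140498", "rs371217242"),
                  V2 = c("chr1", "chr1"),
                  V3 = c(861315L, 861329L),
                  V4 = c("GG", "AA"),
                  stringsAsFactors = FALSE)
df2 <- df1[1, ]  # toy DF2: the second SNP is missing

df3 <- left_join(df1, df2, by = c("V1", "V2", "V3"))
print(names(df3))  # V4 appears twice: V4.x (from df1) and V4.y (from df2)

df3 <- df3[, -4]   # remove V4.x, keeping the V4 values that came from df2
print(df3)
```

Because V4 is not one of the join keys, dplyr keeps both copies and adds the `.x`/`.y` suffixes, so the fourth column is the one from df1.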

Alternatively, if you only need the differences (the rows of df1 that have no match in df2), I suggest the anti_join function:

df4 <- anti_join(df1,df2, by=c("V1", "V2", "V3")) 
View(df4) 
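
As a rough illustration of what `anti_join` returns, here is a sketch using the first four rows of the question's DF1. Building df2 by filtering df1 is just a shortcut I am assuming for the example; in practice df2 would be loaded separately.

```r
library(dplyr)

# First four rows of the question's DF1
df1 <- read.table(text = "V1 V2 V3 V4
rs200140498 chr1 861315 GG
rs371217242 chr1 861329 AA
rs200686669 chr1 861349 CC
rs370046315 chr1 861357 CC", header = TRUE, stringsAsFactors = FALSE)

# Stand-in for DF2: same rows, minus rs371217242
df2 <- df1[df1$V1 != "rs371217242", ]

# anti_join keeps only the rows of df1 with no partner in df2
df4 <- anti_join(df1, df2, by = c("V1", "V2", "V3"))
print(df4)
```

Only the row for rs371217242 should come back, since it is the one SNP that df2 lacks.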

If you need -- instead of NA values, use this:

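# as.character() guards against V4.y being a factor, where "--" would not be a valid level
df3$V4.y <- replace(as.character(df3$V4.y), is.na(df3$V4.y), "--")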
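
Putting the steps together, here is an end-to-end sketch on the data from the question. It is a variation on the answer rather than the same code: I drop df1's V4 before joining (so no `.x`/`.y` suffixes appear) and use `mutate` with `ifelse` for the replacement. The `read.table(text = ...)` calls are only there to make the example self-contained.

```r
library(dplyr)

df1 <- read.table(text = "V1 V2 V3 V4
rs200140498 chr1 861315 GG
rs371217242 chr1 861329 AA
rs200686669 chr1 861349 CC
rs370046315 chr1 861357 CC
rs374110379 chr1 861521 GG
rs74045401 chr1 861530 GG
rs377418023 chr1 865394 CC
rs79027658 chr1 865438 CC
rs202189913 chr1 865488 AA
rs370992396 chr1 865543 GG", header = TRUE, stringsAsFactors = FALSE)

df2 <- read.table(text = "V1 V2 V3 V4
rs200140498 chr1 861315 GG
rs200686669 chr1 861349 CC
rs370046315 chr1 861357 CC
rs74045401 chr1 861530 GG
rs377418023 chr1 865394 CC
rs202189913 chr1 865488 AA
rs370992396 chr1 865543 GG", header = TRUE, stringsAsFactors = FALSE)

result <- df1 %>%
  select(V1, V2, V3) %>%                        # keep only the key columns of df1
  left_join(df2, by = c("V1", "V2", "V3")) %>%  # V4 now comes from df2, NA if unmatched
  mutate(V4 = ifelse(is.na(V4), "--", V4))      # show unmatched rows as "--"

print(result)
```

The result keeps all ten rows of df1 in their original order, with -- in V4 for the three SNPs that are absent from df2.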