2014-02-22

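Merging two tables in R; column names differ by an A or B suffix

I have two data sets that I am trying to merge. The first contains information about each test subject, one per row, identified by a unique ID. The second contains the measurements for each test subject, one per column; however, each subject was measured twice, so the unique ID reads "IDa" and "IDb". I would like to find a way to merge the two tables on the unique ID, regardless of whether the column is measurement A or B.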

Below is a small sample of the two data sets, followed by a table showing the expected output. Any help would be greatly appreciated!

UniqueID  Site  State  Age  Height
Tree001   FK    OR     23   70
Tree002   FK    OR     45   53
Tree003   NM    OR     35   84


UniqueID  Tree001A  Tree001B  Tree002A  Tree002B  Tree003A  Tree003B
1996      4                   2
1997      7         8         7                   3
1998      3         2         9         4         7
1999      11        9         2         12        3         13
2010      8         8         4         6         11        4
2011      10        5         6         3         8         9


UniqueID  Tree001A  Tree001B  Tree002A  Tree002B  Tree003A  Tree003B
Site      FK        FK        FK        FK        NM        NM
State     OR        OR        OR        OR        OR        OR
Age       23        23        45        45        35        35
Height    70        70        53        53        84        84
1996      4                   2
1997      7         8         7                   3
1998      3         2         9         4         7
1999      11        9         2         12        3         13
2010      8         8         4         6         11        4
2011      10        5         6         3         8         9
Could you post the output of `dput(df1)` and `dput(df2)`?

I'm not sure what that means. Sorry! – KKL234
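For readers unfamiliar with the request above: `dput()` prints an R object as code that recreates it, so other people can paste the data straight into their own session. Here is a minimal sketch using a small made-up data frame. The name `example_df` and its values are illustrative and do not come from the question.

```r
# A small, made-up data frame (illustrative only)
example_df <- data.frame(
  UniqueID = c("Tree001", "Tree002"),
  Age = c(23L, 45L),
  stringsAsFactors = FALSE
)

# dput() prints R code that rebuilds the object; that text is what you
# would paste into a question
dput(example_df)

# Round trip: evaluating the printed text gives back the same object
rebuilt <- eval(parse(text = capture.output(dput(example_df))))
identical(rebuilt, example_df)
```

Assigning the pasted text to a name, as in `df1 <- structure(...)`, rebuilds the data on someone else's machine.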

Answer

This could be one approach.

df1 <- structure(list(UniqueID = structure(1:3, .Label = c("Tree001", 
"Tree002", "Tree003"), class = "factor"), Site = structure(c(1L, 
1L, 2L), .Label = c("FK", "NM"), class = "factor"), State = structure(c(1L, 
1L, 1L), .Label = "OR", class = "factor"), Age = c(23L, 45L, 
35L), Height = c(70L, 53L, 84L)), .Names = c("UniqueID", "Site", 
"State", "Age", "Height"), class = "data.frame", row.names = c(NA, 
-3L)) 


df2 <- structure(list(UniqueID = c(1996L, 1997L, 1998L, 1999L, 2010L, 
2011L), Tree001A = c(4L, 7L, 3L, 11L, 8L, 10L), Tree001B = c(NA, 
8L, 2L, 9L, 8L, 5L), Tree002A = c(2L, 7L, 9L, 2L, 4L, 6L), Tree002B = c(NA, 
NA, 4L, 12L, 6L, 3L), Tree003A = c(NA, 3L, 7L, 3L, 11L, 8L), 
    Tree003B = c(NA, NA, NA, 13L, 4L, 9L)), .Names = c("UniqueID", 
"Tree001A", "Tree001B", "Tree002A", "Tree002B", "Tree003A", "Tree003B" 
), class = "data.frame", row.names = c(NA, -6L)) 


> df1
  UniqueID Site State Age Height
1  Tree001   FK    OR  23     70
2  Tree002   FK    OR  45     53
3  Tree003   NM    OR  35     84
> df2
  UniqueID Tree001A Tree001B Tree002A Tree002B Tree003A Tree003B
1     1996        4     <NA>        2     <NA>     <NA>     <NA>
2     1997        7        8        7     <NA>        3     <NA>
3     1998        3        2        9        4        7     <NA>
4     1999       11        9        2       12        3       13
5     2010        8        8        4        6       11        4
6     2011       10        5        6        3        8        9

# Transpose df1 (without its ID column) so that each tree becomes a column
df3 <- as.data.frame(t(df1[,-1]))

# Name the columns after the tree IDs
colnames(df3) <- as.character(df1[,1])

# Move the row names (Site, State, Age, Height) into a UniqueID column
df3$UniqueID <- rownames(df3)

# Replace the row names with plain row numbers
rownames(df3) <- seq_len(nrow(df3))

# Put UniqueID first and repeat each tree's column twice (A and B)
df3 <- df3[,c(4,1,1,2,2,3,3)]
colnames(df3) <- c("UniqueID", "Tree001A", "Tree001B", "Tree002A",
                   "Tree002B", "Tree003A", "Tree003B")

# Convert every column of df2 to character so it can be stacked under df3,
# whose columns hold text after the transpose
df2[] <- lapply(df2, as.character)

# Now combine the two data frames
new <- rbind(df3,df2)
> new
   UniqueID Tree001A Tree001B Tree002A Tree002B Tree003A Tree003B
1      Site       FK       FK       FK       FK       NM       NM
2     State       OR       OR       OR       OR       OR       OR
3       Age       23       23       45       45       35       35
4    Height       70       70       53       53       84       84
5      1996        4     <NA>        2     <NA>     <NA>     <NA>
6      1997        7        8        7     <NA>        3     <NA>
7      1998        3        2        9        4        7     <NA>
8      1999       11        9        2       12        3       13
9      2010        8        8        4        6       11        4
10     2011       10        5        6        3        8        9
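The approach above hard-codes the column positions, so it would need editing for a different number of trees. As a hedged alternative (a sketch, not the answerer's method), the tree ID behind each measurement column can be recovered by stripping the trailing A or B and looking it up in the first table with `match()`. This assumes every measurement column name is a tree ID followed by a single `A` or `B`. The names `measure_cols`, `tree_ids`, `idx`, `meta`, `header` and `combined` are illustrative.

```r
# Copies of the two sample tables from the question
df1 <- data.frame(
  UniqueID = c("Tree001", "Tree002", "Tree003"),
  Site     = c("FK", "FK", "NM"),
  State    = c("OR", "OR", "OR"),
  Age      = c(23L, 45L, 35L),
  Height   = c(70L, 53L, 84L),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  UniqueID = c(1996L, 1997L, 1998L, 1999L, 2010L, 2011L),
  Tree001A = c(4L, 7L, 3L, 11L, 8L, 10L),
  Tree001B = c(NA, 8L, 2L, 9L, 8L, 5L),
  Tree002A = c(2L, 7L, 9L, 2L, 4L, 6L),
  Tree002B = c(NA, NA, 4L, 12L, 6L, 3L),
  Tree003A = c(NA, 3L, 7L, 3L, 11L, 8L),
  Tree003B = c(NA, NA, NA, 13L, 4L, 9L)
)

# Strip the trailing A/B to get the tree ID behind each measurement column
# (assumption: names look like <tree ID> followed by one A or B)
measure_cols <- names(df2)[-1]
tree_ids <- sub("[AB]$", "", measure_cols)

# For each measurement column, find the row of df1 that describes that tree
idx <- match(tree_ids, df1$UniqueID)
stopifnot(!anyNA(idx))  # every measurement column needs a matching tree

# Pull those rows, convert to text, and transpose so each attribute is a row
meta <- df1[idx, names(df1)[-1]]
meta[] <- lapply(meta, as.character)
header <- as.data.frame(t(meta), stringsAsFactors = FALSE)
names(header) <- measure_cols
header$UniqueID <- rownames(header)
rownames(header) <- NULL
header <- header[, names(df2)]  # same column order as df2

# Stack the attribute rows on top of the measurements (all as text)
measurements <- df2
measurements[] <- lapply(measurements, as.character)
combined <- rbind(header, measurements)
print(combined)
```

Because the attribute rows contain text such as "FK", every column of the combined table ends up as character. If the measurements need to stay numeric, it may be better to keep the two tables separate and look up tree attributes through `tree_ids` when they are needed.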
This worked great! As a newcomer to R I still have a lot to learn, and I appreciate the help. – KKL234

You're welcome. I'm glad it worked for you.
