Merge data frames in R and sum the values of identical columns

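I have 3 data frames (rows: sites, columns: species names) holding species abundances within sites. The number of rows is the same, but the number of columns differs, because not every species occurs in all three data frames. I would like to merge them into a single data frame in which the abundances of identical species are summed. For example: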

data.frame1

      Sp1 Sp2 Sp3 Sp4
site1   1   2   3   1
site2   0   2   0   1
site3   1   1   1   1

data.frame2

      Sp1 Sp2 Sp4
site1   0   1   2
site2   1   2   0
site3   1   1   1

data.frame3

      Sp1 Sp2 Sp5 Sp6
site1   0   1   1   1
site2   1   1   1   5
site3   2   0   0   0

What I would like to get is something like this:

      Sp1 Sp2 Sp3 Sp4 Sp5 Sp6
site1   1   4   3   3   1   1
site2   2   5   0   1   1   5
site3   4   2   1   2   0   0

I think I have to use merge, but so far my attempts have not given me what I want.

Any help is appreciated.


2013-04-15 eugenego

Perhaps 'aggregate' would be better than 'merge'?

I would use rbind.fill from plyr, like this:

library(plyr)  # provides rbind.fill() and ddply()

# Stack the three data frames; species missing from a frame become NA
pp <- cbind(names=c(rownames(df1), rownames(df2), rownames(df3)),
            rbind.fill(list(df1, df2, df3)))

#   names Sp1 Sp2 Sp3 Sp4 Sp5 Sp6
# 1 site1   1   2   3   1  NA  NA
# 2 site2   0   2   0   1  NA  NA
# 3 site3   1   1   1   1  NA  NA
# 4 site1   0   1  NA   2  NA  NA
# 5 site2   1   2  NA   0  NA  NA
# 6 site3   1   1  NA   1  NA  NA
# 7 site1   0   1  NA  NA   1   1
# 8 site2   1   1  NA  NA   1   5
# 9 site3   2   0  NA  NA   0   0

Then aggregate with plyr's ddply as follows:

# Sum every species column within each site, ignoring the NAs
ddply(pp, .(names), function(x) colSums(x[,-1], na.rm = TRUE))
#   names Sp1 Sp2 Sp3 Sp4 Sp5 Sp6
# 1 site1   1   4   3   3   1   1
# 2 site2   2   5   0   1   1   5
# 3 site3   4   2   1   2   0   0


2013-04-15 15:35:38 Arun

I had a solution, and I guarantee it was not as elegant as this one. +1

Works perfectly! Unfortunately I can't vote yet :( – eugenego

@eugenego You can click the check mark next to the solution that best answers your question.

An alternative to Arun's answer: create a 'template' data frame with all the columns you need:

Rgames> bbar<-data.frame('one'=rep(0,3),'two'=rep(0,3),'three'=rep(0,3)) 
Rgames> bbar 
  one two three
1   0   0     0
2   0   0     0
3   0   0     0

Then, given each of your data frames, like

Rgames> bar1<-data.frame('one'=c(1,2,3),'two'=c(4,5,6)) 
Rgames> bar1 
  one two
1   1   4
2   2   5
3   3   6

create an extended data frame:

Rgames> newbar1<-bbar 
Rgames> for (jj in names(bar1)) newbar1[[jj]]<-bar1[[jj]]
Rgames> newbar1
  one two three
1   1   4     0
2   2   5     0
3   3   6     0

Then sum all of these extended data frames. Clunky but simple.
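
The final summing step is not spelled out above. Here is a minimal base R sketch of the whole template approach, assuming every data frame has the same rows in the same order. The second frame `bar2`, the helper `expand_to_template` and the variable `all_cols` are hypothetical names introduced only for illustration:

```r
# Two small example frames with partly overlapping columns
bar1 <- data.frame(one = c(1, 2, 3), two = c(4, 5, 6))
bar2 <- data.frame(two = c(1, 1, 1), three = c(7, 8, 9))  # hypothetical second frame

frames <- list(bar1, bar2)

# Every column name that occurs in at least one frame
all_cols <- unique(unlist(lapply(frames, names)))

# Hypothetical helper: copy a frame into a zero-filled template
expand_to_template <- function(df, cols) {
  out <- as.data.frame(matrix(0, nrow = nrow(df), ncol = length(cols),
                              dimnames = list(rownames(df), cols)))
  for (jj in names(df)) out[[jj]] <- df[[jj]]
  out
}

expanded <- lapply(frames, expand_to_template, cols = all_cols)

# Element-wise sum of the expanded frames (assumes the rows line up)
total <- Reduce(`+`, expanded)
total
```

Building the template from the union of column names avoids typing the species list by hand, and `Reduce` keeps the sum independent of how many frames there are.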


2013-04-15 15:45:02

Another option is to use melt/cast from reshape2. Here is a simple example:

library(reshape2)  # provides melt() and dcast()

df1 <- read.table(header=T, text="
    Sp1 Sp2 Sp3 Sp4 
    site1 1 2 3 1 
    site2 0 2 0 1 
    site3 1 1 1 1") 

df2 <- read.table(header=T, text=" 
     Sp1 Sp2 Sp4 
site1 0 1 2 
site2 1 2 0 
site3 1 1 1") 

df3 <- read.table(header=T, text=" 
     Sp1 Sp2 Sp5 Sp6 
site1 0 1 1 1  
site2 1 1 1 5 
site3 2 0 0 0") 

df1$site <- rownames(df1) 
df2$site <- rownames(df2) 
df3$site <- rownames(df3) 

DF <- rbind(melt(df1,id="site"),melt(df2,id="site"),melt(df3,id="site")) 
dcast(data=DF,formula=site ~ variable,fun.aggregate=sum) 

   site Sp1 Sp2 Sp3 Sp4 Sp5 Sp6
1 site1   1   4   3   3   1   1
2 site2   2   5   0   1   1   5
3 site3   4   2   1   2   0   0

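In short, we add the site to each data frame as an extra variable and convert each one to long format, then concatenate them into a single data frame that holds all values in long format. With dcast we build the data frame you need, with sites in rows (the left-hand side of the formula) and variables in columns (the right-hand side). The sum function aggregates the cases where a site/species combination has more than one value.

Of course, the code can be extended to the more general case with loops or *apply functions.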
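
As a rough sketch of that generalisation, the per-frame steps can be wrapped in a function and applied over a list with lapply. The helper name `to_long` and the variables `frames`, `long` and `result` are hypothetical, and the example assumes reshape2 is installed:

```r
library(reshape2)

df1 <- read.table(header = TRUE, text = "
      Sp1 Sp2 Sp3 Sp4
site1   1   2   3   1
site2   0   2   0   1
site3   1   1   1   1")

df2 <- read.table(header = TRUE, text = "
      Sp1 Sp2 Sp4
site1   0   1   2
site2   1   2   0
site3   1   1   1")

df3 <- read.table(header = TRUE, text = "
      Sp1 Sp2 Sp5 Sp6
site1   0   1   1   1
site2   1   1   1   5
site3   2   0   0   0")

frames <- list(df1, df2, df3)

# Hypothetical helper: move row names into a `site` column, then melt to long format
to_long <- function(df) {
  df$site <- rownames(df)
  melt(df, id.vars = "site")
}

# Stack all long-format frames, then cast back to wide, summing duplicates
long <- do.call(rbind, lapply(frames, to_long))
result <- dcast(long, site ~ variable, value.var = "value", fun.aggregate = sum)
result
```

Adding a fourth data frame then only means appending it to `frames`.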


2013-04-15 15:49:59

Adding to the available options, here are two more that stick with base R.

First option: wide aggregation (sort of)

temp <- cbind(df1, df2, df3)
temp
#       Sp1 Sp2 Sp3 Sp4 Sp1 Sp2 Sp4 Sp1 Sp2 Sp5 Sp6
# site1   1   2   3   1   0   1   2   0   1   1   1
# site2   0   2   0   1   1   2   0   1   1   1   5
# site3   1   1   1   1   1   1   1   2   0   0   0
sapply(unique(colnames(temp)),
       function(x) rowSums(temp[, colnames(temp) == x, drop = FALSE]))
#       Sp1 Sp2 Sp3 Sp4 Sp5 Sp6
# site1   1   4   3   3   1   1
# site2   2   5   0   1   1   5
# site3   4   2   1   2   0   0

Second option: semi-wide to long to wide

Conceptually, this is similar to Maxim.K's answer: get the data into long form, which makes it much easier to manipulate:

> temp1 <- t(cbind(df1, df2, df3))
> # You'll get a warning in the next step
> # Safe to ignore though...
> temp2 <- data.frame(var = rownames(temp1), stack(data.frame(temp1)))
Warning message:
In data.row.names(row.names, rowsi, i) :
  some row.names duplicated: 5,6,7,8,9 --> row.names NOT used
> xtabs(values ~ ind + var, temp2)
       var
ind     Sp1 Sp2 Sp3 Sp4 Sp5 Sp6
  site1   1   4   3   3   1   1
  site2   2   5   0   1   1   5
  site3   4   2   1   2   0   0


2013-04-15 19:11:47 A5C1D2H2I1M1N2O1R2T1
