2012-06-08 48 views
3

R: best way to add (sum) columns across different data frames

I have three different data frames like this:

V1.x<-c(1,2,3,4,5) 
V2.x<-c(2,2,7,3,1) 
V3.x<-c(2,4,3,2,9) 
D1<-data.frame(ID=c("A","B","C","D","E"),V1.x=V1.x,V2.x=V2.x,V3.x=V3.x) 

V1.y<-c(2,3,3,3,5) 
V2.y<-c(1,2,3,3,5) 
V3.y<-c(6,4,3,2,2) 
D2<-data.frame(ID=c("A","B","C","D","E"),V1.y=V1.y,V2.y=V2.y,V3.y=V3.y) 

V1<-c(3,2,4,4,5) 
V2<-c(3,7,3,4,5) 
V3<-c(5,4,3,6,3) 
D3<-data.frame(ID=c("A","B","C","D","E"),V1=V1,V2=V2,V3=V3) 

I want to add up all the V1 columns, all the V2 columns, and all the V3 columns:

V1_Add<-D1$V1.x+D2$V1.y+D3$V1 
V2_Add<-D1$V2.x+D2$V2.y+D3$V2 
V3_Add<-D1$V3.x+D2$V3.y+D3$V3 

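This works fine for getting the individual column sums, but in the real data the columns run from V1 to V80, so it would be great not to have to type out each column individually. I would also prefer to end up with a single data frame containing all the final sums, like this: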

ID V1 V2 V3 
1 A 6 6 13 
2 B 7 11 12 
3 C 10 13 9 
4 D 11 10 10 
5 E 15 11 14 

Answers

2
library(reshape2) 
library(plyr) 

# First let's standardize column names after ID so they become V1 through Vx. 
# I turned it into a function to make this easy to do for multiple data.frames 
standardize_col_names <- function(df) { 
# First column remains ID, then I name the remaining V1 through Vn-1 
# (since first column is taken up by the ID) 
names(df) <- c("ID", paste("V",1:(dim(df)[2]-1),sep="")) 
return(df) 
} 

D1 <- standardize_col_names(D1) 
D2 <- standardize_col_names(D2) 
D3 <- standardize_col_names(D3) 

# Next, we melt the data and bind them into the same data.frame 
# See one example with melt(D1, id.vars=1). I just used rbind to combine those 
melted_data <- rbind(melt(D1, id.vars=1), melt(D2, id.vars=1), melt(D3, id.vars=1)) 
# note that the above step can be folded into the function as well. 
# Then you throw all the data.frames into a list and ldply through this function. 

# Finally, we cast the data into what you need which is the sum of the columns 
dcast(melted_data, ID~variable, sum) 
    ID V1 V2 V3 
1 A 6 6 13 
2 B 7 11 12 
3 C 10 13 9 
4 D 11 10 10 
5 E 15 11 14 



# Combined everything above more efficiently : 

    standardize_df <- function(df) { 
    names(df) <- c("ID", paste("V",1:(dim(df)[2]-1),sep="")) 
    return(melt(df, id.vars = 1)) 
    } 

    all_my_data <- list(D1,D2,D3) 
    melted_data <- ldply(all_my_data, standardize_df) 
    summarized_data <- dcast(melted_data, ID~variable, sum) 
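
For anyone who would rather avoid the extra packages, the same "stack everything, then sum by ID" idea can probably be sketched in base R with `rbind` and `aggregate`. This is only a sketch, not the approach the answer above uses: the helper name `rename_cols` is made up for illustration, and the data frames are rebuilt from the values in the question so the block runs on its own.

```r
# Data frames rebuilt from the question so the example is self-contained.
ids <- c("A", "B", "C", "D", "E")
D1 <- data.frame(ID = ids, V1.x = c(1, 2, 3, 4, 5), V2.x = c(2, 2, 7, 3, 1), V3.x = c(2, 4, 3, 2, 9))
D2 <- data.frame(ID = ids, V1.y = c(2, 3, 3, 3, 5), V2.y = c(1, 2, 3, 3, 5), V3.y = c(6, 4, 3, 2, 2))
D3 <- data.frame(ID = ids, V1 = c(3, 2, 4, 4, 5), V2 = c(3, 7, 3, 4, 5), V3 = c(5, 4, 3, 6, 3))

# Hypothetical helper: keep ID first and rename the rest to V1..Vn,
# so that rbind() can stack the data frames.
rename_cols <- function(df) {
  names(df) <- c("ID", paste0("V", seq_len(ncol(df) - 1)))
  df
}

# Stack all rows, then sum every value column within each ID.
stacked <- do.call(rbind, lapply(list(D1, D2, D3), rename_cols))
summed <- aggregate(. ~ ID, data = stacked, FUN = sum)
summed
```

Because the sums are grouped by ID, this version should not depend on the rows being in the same order in every data frame. It still assumes the value columns appear in the same order. Note that the formula interface of `aggregate` drops rows containing `NA` by default, which may or may not be what you want on real data.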
+0

I like the less efficient answer, since I'm not very efficient myself and it's easier to grasp :) Thanks for your answer! – Vinterwoo

+0

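I wrote the "less efficient" one so that you could see all the pieces and how they work. I combined them afterwards to show you how it all fits together. Glad it was useful :) – Maiasaura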

2

Is something like this what you want?

D.Add <- data.frame(D1[,1],(D1[,-1]+D2[,-1]+D3[,-1])) 
colnames(D.Add)<-colnames(D3) 
+0

That's really neat, Glen. I knew this was possible with matrices, but seeing it done with data frames was a surprise. –

2

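Here's one approach that may be overkill, but it should generalize fairly well to any number of columns and to any number of "index" columns as well. It does assume that all of your data frames have the same number of columns and that they are in the correct order. First, create a list object from all the data.frames. I referred to this question to do that programmatically.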

ClassFilter <- function(x, class) inherits(get(x), "data.frame") 
Objs <- Filter(ClassFilter, ls()) 
Objs <- lapply(Objs, "get") 
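
As a side note, the collection step could probably also be written with `mget` and `is.data.frame`, which keeps the object names on the resulting list. This is a minimal sketch rather than the code this answer relies on: `collect_data_frames` is a hypothetical helper name, and the throwaway environment exists only to make the example self-contained.

```r
# Hypothetical helper: return every data frame found in `env` as a named list.
collect_data_frames <- function(env = globalenv()) {
  Filter(is.data.frame, mget(ls(envir = env), envir = env))
}

# Demonstration on a scratch environment, so the workspace is not touched.
demo_env <- new.env()
assign("D1", data.frame(ID = c("A", "B"), V1 = c(1, 2)), envir = demo_env)
assign("D2", data.frame(ID = c("A", "B"), V1 = c(3, 4)), envir = demo_env)
assign("not_a_df", 1:3, envir = demo_env)

found <- collect_data_frames(demo_env)
names(found)
```

Like the `ls()` version, this picks up every data frame in the environment, so any unrelated data frames lying around would be included too.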

Next, I wrote a function that uses Reduce to add all the numeric columns together and then stitches the non-numeric columns back on at the end:

FUN <- function(x){ 
    colsToProcess <- lapply(x, function(y) y[, unlist(sapply(y, is.numeric))]) 
    result <- Reduce("+", colsToProcess) 
    #Get the non numeric columns 
    nonNumericCols <- x[[1]] 
    nonNumericCols <- nonNumericCols[, !(unlist(sapply(nonNumericCols, is.numeric)))] 
    return(data.frame(Index = nonNumericCols, result)) 
} 

Finally, in action:

> FUN(Objs) 
    Index V1.x V2.x V3.x 
1  A 6 6 13 
2  B 7 11 12 
3  C 10 13 9 
4  D 11 10 10 
5  E 15 11 14 
0

What about simply adding up the whole blocks?

D1[,2:4] + D3[,2:4] + D2[,2:4] 

...which results in...

V1.x V2.x V3.x 
1 6 6 13 
2 7 11 12 
3 10 13 9 
4 11 10 10 
5 15 11 14 

It assumes that all the variables are in the same order, but otherwise it should work well.
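
Block addition also relies on the rows lining up by ID. If that might not hold, one possible safeguard is to reorder every data frame to a common ID order with `match` before adding. The sketch below uses small made-up data (not the data from the question) and a hypothetical helper named `align_to`.

```r
# Made-up example: D2 has the same IDs as D1, but in a different row order.
D1 <- data.frame(ID = c("A", "B", "C"), V1 = c(1, 2, 3), V2 = c(2, 2, 7))
D2 <- data.frame(ID = c("C", "A", "B"), V1 = c(3, 2, 3), V2 = c(3, 1, 2))

# Hypothetical helper: reorder the rows of df so that its IDs follow `ids`.
align_to <- function(df, ids) {
  out <- df[match(ids, df$ID), , drop = FALSE]
  rownames(out) <- NULL
  out
}

ids <- D1$ID
aligned <- lapply(list(D1, D2), align_to, ids = ids)

# Add the value columns (everything except ID) element by element.
totals <- Reduce(`+`, lapply(aligned, function(df) df[, -1, drop = FALSE]))
result <- data.frame(ID = ids, totals, row.names = NULL)
result
```

This assumes each ID appears exactly once in every data frame. An ID that is missing from one of them would produce a row of `NA` after `match`, so it is worth checking for that on real data.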
