Merging and aggregating multiple data frames

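I have a set of .csv files, each containing the same number of rows and columns. Each file contains observations (column "value") of some test subjects characterized by A, B and C, and takes a form similar to the following: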

A B C value 
1 1 1 0.5 
1 1 2 0.6 
1 2 1 0.1 
1 2 2 0.2 
. . . .

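Assume each file is read into a separate data frame. What would be the most efficient way to combine these data frames into a single data frame in which the "value" column contains the means or, more generally, the result of some function call over all "value" rows for a given test subject? Columns A, B and C are constant across all files and can be regarded as keys for these observations.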

Thanks for your help.

Source

2014-03-03 voo

This should be pretty easy, assuming the files are all sorted the same way:

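# read every CSV file in the working directory into a list of data frames
dflist <- lapply(dir(pattern = '\\.csv$'), read.csv)
# bind the 'value' columns side by side: one column per file
values <- do.call('cbind', lapply(dflist, `[[`, 'value'))
# row means:
rowMeans(values)
# other function `myfun` applied to each row:
apply(values, 1, myfun)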
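
The calls above return a bare vector of aggregated values. To get a single data frame back, that vector can be attached to the key columns of one of the files. The following is only a minimal sketch under the same assumption that every file lists the A, B, C combinations in the same row order; the two small data frames and the names `keys` and `combined` are made up for illustration and stand in for files read with read.csv:

```r
# Two made-up data frames standing in for files read with read.csv();
# both list the A, B, C keys in the same row order (an assumption).
keys <- data.frame(A = c(1, 1, 1, 1), B = c(1, 1, 2, 2), C = c(1, 2, 1, 2))
dflist <- list(
  cbind(keys, value = c(0.5, 0.6, 0.1, 0.2)),
  cbind(keys, value = c(0.7, 0.4, 0.3, 0.2))
)

# One column per file, one row per test subject
values <- do.call(cbind, lapply(dflist, `[[`, "value"))

# Reuse the key columns of the first file and attach the aggregated value
combined <- cbind(dflist[[1]][c("A", "B", "C")], value = rowMeans(values))
print(combined)
```

Replacing `rowMeans(values)` with `apply(values, 1, myfun)` would give the same layout for any other row-wise function.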

Source

2014-03-03 10:42:07 Thomas

Here is another solution for the case where the keys can be in any order, or may be missing:

n <- 10 # number of csv files to create 
obs <- 10 # number of observations per file 
# create test files 
for (i in 1:n){ 
    df <- data.frame(A = sample(1:3, obs, TRUE) 
       , B = sample(1:3, obs, TRUE) 
       , C = sample(1:3, obs, TRUE) 
       , value = runif(obs) 
       ) 
    write.csv(df, file = tempfile(fileext = '.csv'), row.names = FALSE) 
} 


# read in the data (the pattern is a regular expression, not a glob) 
input <- lapply(list.files(tempdir(), "\\.csv$", full.names = TRUE) 
    , function(file) read.csv(file) 
    ) 

# put the data frames together and then compute the mean for each unique 
# combination of A, B & C, assuming that they could be in any order. 
input <- do.call(rbind, input) 
result <- lapply(split(input, list(input$A, input$B, input$C), drop = TRUE) 
    , function(sect){ 
     sect$value[1L] <- mean(sect$value) 
     sect[1L, ] 
    } 
) 

# create output DF 
result <- do.call(rbind, result) 
result
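
The split()/lapply() step above can likely be written more compactly with base R's aggregate() and its formula interface. This is an alternative sketch, not the answer's own method; the small `input` data frame is made up and stands in for the row-bound contents of the files:

```r
# Made-up data standing in for the row-bound contents of several files;
# keys appear in arbitrary order and one combination occurs only once.
input <- data.frame(
  A = c(1, 1, 2, 1, 2),
  B = c(1, 1, 1, 2, 1),
  C = c(1, 1, 2, 1, 2),
  value = c(0.5, 0.7, 0.1, 0.4, 0.3)
)

# Apply FUN to the 'value' entries of each distinct (A, B, C) combination
result <- aggregate(value ~ A + B + C, data = input, FUN = mean)
print(result)
```

Any other function that reduces a numeric vector to a single value can be passed as `FUN` in place of `mean`.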

Source

2014-03-03 13:09:22
