2015-11-02

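I have a list of files, and I wrote a function that processes each file and returns two columns ("name" and "value"). I want to merge the processed files into one data frame and use each file's name as the name of the corresponding column.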

file_list <- list.files(pattern = "\\.txt$")
sample_name <- sub(".*?lvl.(.*?)\\.txt", "\\1", file_list)

for (i in seq_along(file_list)) {
  x <- cleanMyData(file_list[i])  # this function returns a two-column data frame
  # Then I want to merge all of the processed data into one data frame,
  # merging the "value" columns on the "name" column.
  # At the same time, I want each file's name as the corresponding column name.
  # I have already processed the file names and stored them in sample_name.
}

To be clearer, here is an example of my processed data:

file: apple.txt 
name value 
A  12 
B  13 
C  14 

file: pear.txt 
name value 
A  15 
B  14 
C  20 
D  21 

Expected output:

Apple Pear 
A 12 15 
B 13 14 
C 14 20 

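You could simply `cbind` the two data frames, but that assumes the rows line up exactly. Another option is to `merge()` the two data frames on the 'name' column.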

Answer


You could try

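fns <- c("apple.txt", "pear.txt")
(df <- Reduce(
  # merge() joins on the shared "name" column; all = FALSE keeps only
  # the names that appear in every file (an inner join)
  function(...) merge(..., all = FALSE),
  lapply(seq_along(fns), function(i) {
    # name the value column after the file, without its extension
    read.table(fns[i],
               header = TRUE,
               col.names = c("name", tools::file_path_sans_ext(fns[i])))
  })
))
#   name apple pear
# 1    A    12   15
# 2    B    13   14
# 3    C    14   20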

To capitalize the first character, you could use something like

sub("\\b(\\w)", "\\U\\1", fns, perl=TRUE) 

(see ?sub)
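As a rough sketch of how that could be applied to the merged result rather than to the file names, the snippet below rebuilds the example data frame by hand (the values are copied from the output above) and capitalizes every column name except `name`. The variable `is_sample` is introduced here only for illustration.

```r
# Stand-in for the merged result (values copied from the example output).
df <- data.frame(name  = c("A", "B", "C"),
                 apple = c(12, 13, 14),
                 pear  = c(15, 14, 20))

# Capitalize the first letter of every column name except "name".
# \\U is a Perl-style case conversion, so perl = TRUE is required.
is_sample <- names(df) != "name"
names(df)[is_sample] <- sub("\\b(\\w)", "\\U\\1", names(df)[is_sample], perl = TRUE)

print(names(df))
```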

To get rid of the name column, you could use subset(df, select = -name)
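Putting the pieces together with the variables from the question, one possible sketch looks like the following. It rests on several assumptions: `cleanMyData()` is the asker's own function and is not shown, so a hypothetical stand-in based on `read.table()` is defined here; the sample names are taken from the file names with `tools::file_path_sans_ext()` rather than the asker's `sub()` pattern; and the two example files are written to a temporary directory only so that the code runs as-is.

```r
# Hypothetical stand-in for the asker's cleanMyData(): reads a
# whitespace-separated file whose header row is "name value".
cleanMyData <- function(path) {
  read.table(path, header = TRUE, stringsAsFactors = FALSE)
}

# Write the two example files to a temporary directory.
tmp <- tempfile("merge_demo")
dir.create(tmp)
writeLines(c("name value", "A 12", "B 13", "C 14"),
           file.path(tmp, "apple.txt"))
writeLines(c("name value", "A 15", "B 14", "C 20", "D 21"),
           file.path(tmp, "pear.txt"))

file_list   <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)
sample_name <- tools::file_path_sans_ext(basename(file_list))

# Process each file and rename its "value" column after the sample.
processed <- Map(function(path, sample) {
  x <- cleanMyData(path)
  names(x)[names(x) == "value"] <- sample
  x
}, file_list, sample_name)

# Inner join on "name": rows missing from any file (D here) are dropped.
merged <- Reduce(function(a, b) merge(a, b, by = "name"), processed)

# Move "name" into the row names to match the layout of the expected output.
result <- merged[, -1, drop = FALSE]
rownames(result) <- merged$name
print(result)
```

If names that are missing from some files should be kept, passing `all = TRUE` to `merge()` performs a full outer join and fills the gaps with `NA`.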
