Loop over multiple data frames and store the results

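I want to run at least six steps in a loop in R. My data set consists of 28 files stored in one folder. Each file has 22 rows (21 individual cases plus one row of column names) and the following columns: Id, id, PC1, PC2, ..., PC20.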

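I intend to:

read each file into R as a data frame
delete the first column, named "Id", in each data frame
arrange each data frame as follows:
- the first column should be "id" and
- the next ten columns should be the first ten PCs (PC1, PC2, ... PC10)
sort each data frame by "id" (the data frames should list the individuals and their respective PC scores in the same order)
perform pairwise comparisons with the protest function from the vegan package for all possible pairs (378 combinations)
store the result of each pairwise comparison in a symmetric (28 * 28) matrix that will be used in further analysis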

At the moment I can only do this by hand for each pair of data sets (code below):

## 1. step
## read files into R as data frames
c_2d_hand_1a<-read.table("https://googledrive.com/host/0B90n5RdIvP6qbkNaUG1rTXN5OFE/PC scores, c_2d_hand-1a, Symmetric component.txt",header=TRUE)
c_2d_hand_1b<-read.table("https://googledrive.com/host/0B90n5RdIvP6qbkNaUG1rTXN5OFE/PC scores, c_2d_hand-1b, Symmetric component.txt",header=TRUE)

## 2. step
## delete the first column, named "Id", in each data frame
c_2d_hand_1a[,1]<-NULL 
c_2d_hand_1b[,1]<-NULL 

## 3. step
## keep 11 columns (id, PC1, PC2 ... PC10) so each data frame has 21 rows and 11 columns
c_2d_hand_1a<-c_2d_hand_1a[,1:11] 
c_2d_hand_1b<-c_2d_hand_1b[,1:11] 

## 4. step
## sort each data frame by "id"
c_2d_hand_1a<-c_2d_hand_1a[order(c_2d_hand_1a$id),] 
c_2d_hand_1b<-c_2d_hand_1b[order(c_2d_hand_1b$id),] 

## 5. step
## perform the pairwise comparison with the protest function
library(permute) 
library(vegan) 
c_2d_hand_1a_c_2d_hand_1b<-protest(c_2d_hand_1a[,2:ncol(c_2d_hand_1a)],c_2d_hand_1b[,2:ncol(c_2d_hand_1b)],permutations=10000) 
summary(c_2d_hand_1a_c_2d_hand_1b)[2] ## or c_2d_hand_1a_c_2d_hand_1b[3]

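Since I am a novice at data handling and manipulation in R, my self-taught skills only let me perform each step manually, typing the code for each data set and running one pairwise comparison at a time. Because I need to perform these steps 378 times, typing them by hand would be tedious and time-consuming.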

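I tried importing the files as a list and attempted several operations on it, but without success. Specifically, using list.files(), I created a list named "probe". I was able to select a particular element using e.g. probe[2]. I could also access the column "Id" via e.g. probe[2][1], and delete it via probe[2][1] <- NULL. But when I tried to use a for loop, I got stuck.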
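A minimal sketch of the list-based approach the question is reaching for, under stated assumptions: the folder, the file names (`set_a.txt`, `set_b.txt`), and the helper `prepare()` are hypothetical, and the two small files are generated on the spot only so the example runs on its own. It also shows the difference between `probe[2]` (a list of length one) and `probe[[2]]` (the data frame itself), which is a likely reason the single-bracket attempts did not behave as expected.

```r
# Write two small stand-in files to a temporary folder (hypothetical data,
# only so that the sketch is self-contained)
folder <- file.path(tempdir(), "pc_scores")
dir.create(folder, showWarnings = FALSE)
set.seed(42)
for (nm in c("set_a", "set_b")) {
  d <- data.frame(Id = 1:21, id = sample(21), matrix(rnorm(21 * 20), nrow = 21))
  names(d)[-(1:2)] <- paste0("PC", 1:20)
  write.table(d, file.path(folder, paste0(nm, ".txt")), row.names = FALSE)
}

# Step 1: read every .txt file in the folder into a named list of data frames
files <- list.files(folder, pattern = "\\.txt$", full.names = TRUE)
probe <- lapply(files, read.table, header = TRUE)
names(probe) <- tools::file_path_sans_ext(basename(files))

# Steps 2-4: drop "Id", keep id + PC1..PC10, and sort the rows by id
prepare <- function(df) {
  df <- df[, c("id", paste0("PC", 1:10))]
  df[order(df$id), ]
}
probe <- lapply(probe, prepare)

# Single brackets return a sub-list, double brackets return the element
class(probe[2])    # "list"
class(probe[[2]])  # "data.frame"
```

Selecting the columns by name rather than by position means the result does not depend on whether the "Id" column has already been removed.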

Source

2013-05-29 Newbie_R

You should use lists and `lapply`/`sapply`, together with `list.files`.

Thanks Roman, I tried that before posting. However, because of my lack of experience, I had trouble working with these functions.

This code is untested, but with some luck it should work. The summaries of the protest() results are stored in a matrix of lists.

# develop a way to easily reference all of the URLs 
url.begin <- "https://googledrive.com/host/0B90n5RdIvP6qbkNaUG1rTXN5OFE/PC scores, " 
url.middle <- c("c_2d_hand-1a", "c_2d_hand-1b") 
url.end <- ", Symmetric component.txt" 
L <- length(url.middle) 

# read in all of the data and save it to a list of data frames 
mybiglist <- lapply(url.middle, function(mid) read.table(paste0(url.begin, mid, url.end), header=TRUE)) 

# save columns 2 to 12 in each data frame and order by id 
mybiglist11cols <- lapply(mybiglist, function(df) df[order(df$id), 2:12]) 

# get needed packages 
library(permute) 
library(vegan) 

# create empty matrix of lists to store results 
results <- matrix(vector("list", L*L), nrow=L, ncol=L) 
# perform pairwise comparison by protest function 
for (i in seq_len(L)) {
  for (j in seq_len(L)) {
    df1 <- mybiglist11cols[[i]]
    df2 <- mybiglist11cols[[j]]
    results[i, j] <- list(summary(protest(df1[, -1], df2[, -1], permutations=10000)))
  }
}
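The double loop above evaluates all L*L ordered pairs, including each data frame against itself. If only a numeric symmetric matrix is needed, one possible variation is to visit each unique pair once with `combn()` and mirror the value. The sketch below is an illustration under assumptions, not part of the original answer: `pairwise_matrix()` and `sym_ss()` are hypothetical helpers, the data are random stand-ins, and `sym_ss()` is a base-R computation of the symmetric Procrustes sum of squares used here so the example runs without vegan. The diagonal is left at 0 on the assumption that a configuration matches itself perfectly.

```r
# Symmetric Procrustes sum of squares for two numeric tables with the same
# number of rows (base-R stand-in for the ss reported by protest())
sym_ss <- function(x, y) {
  prep <- function(m) {
    m <- scale(as.matrix(m), center = TRUE, scale = FALSE)  # centre columns
    m / sqrt(sum(m^2))                                      # unit sum of squares
  }
  1 - sum(svd(crossprod(prep(x), prep(y)))$d)^2
}

# Fill a symmetric matrix from the unique pairs of a list of data frames;
# the first column of each data frame (id) is dropped before comparing
pairwise_matrix <- function(dfs, stat) {
  n <- length(dfs)
  out <- matrix(0, n, n, dimnames = list(names(dfs), names(dfs)))
  pairs <- combn(n, 2)  # one column per unique pair (i < j)
  for (k in seq_len(ncol(pairs))) {
    i <- pairs[1, k]
    j <- pairs[2, k]
    out[i, j] <- out[j, i] <- stat(dfs[[i]][, -1], dfs[[j]][, -1])
  }
  out
}

# Hypothetical stand-in data: four data frames with id + 10 score columns
set.seed(1)
dfs <- lapply(1:4, function(k) data.frame(id = 1:21, matrix(rnorm(210), 21)))
names(dfs) <- paste0("set", 1:4)

ss <- pairwise_matrix(dfs, sym_ss)
ncol(combn(28, 2))  # 378 unique pairs for 28 data frames
```

With vegan loaded, the same helper could be called with `function(a, b) protest(a, b, permutations = 10000)$ss` as `stat`, which would also run the permutation test for each pair.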

Source

2013-05-29 22:07:11

Jean, thank you very much for the code. I tested it, and after adding a closing parenthesis on the seventh line of code, it works.

Very good. I added the missing closing paren.

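Jean, your code as posted handles two files (matrices). I extended it, and it works for all 28 matrices. So after the protest analysis, the resulting 28 * 28 matrix is filled with the sum of squares (ss) for each pair of data frames. This raises a new question: how should I judge a given ss value from a comparison, and the correspondence (match) between a pair of matrices?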
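As a rough guide to reading those values: with the symmetric scaling that protest() uses, ss lies between 0 and 1, and smaller values mean a closer match. The vegan documentation describes the test statistic as a correlation-like quantity r = sqrt(1 - ss), which the protest result reports as `t0`, alongside a permutation p-value in `signif`. The sketch below converts a matrix of ss values into that correlation and ranks the pairs. The numbers and the names `a`, `b`, `c` are made up for illustration and are not results from the data in this thread.

```r
# Illustrative ss values only (made up, not results from the data above)
ss <- matrix(c(0.00, 0.19, 0.64,
               0.19, 0.00, 0.51,
               0.64, 0.51, 0.00),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# Procrustes correlation: 1 = identical configurations, 0 = no match
r <- sqrt(1 - ss)
round(r, 2)

# Rank the unique pairs from best to worst match (smallest ss first)
idx <- which(upper.tri(ss), arr.ind = TRUE)
pairs <- data.frame(first  = rownames(ss)[idx[, "row"]],
                    second = colnames(ss)[idx[, "col"]],
                    ss     = ss[idx],
                    r      = r[idx])
pairs[order(pairs$ss), ]
```

The correlation only describes the strength of the match. Whether a match is stronger than expected by chance is what the permutation p-value is for, so both are worth keeping for each pair.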
