Creating data frames in an R loop and naming them


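I am working with 5 data frames that I want to filter (removing rows that match a regular expression). Because all the data frames are similar and use the same variable names, I stored them in a list and am iterating over it. However, when I try to save the filtered data for each original data frame, I find that it creates i_filtered (instead of dfName_filtered), so it gets overwritten on every iteration of the loop. Here is my loop: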

for (i in list_all){ 
    i_filtered1 <- i[i$chr != filter1,] 
    i_filtered2 <- i[i$chr != filter2,] 
    #Write the result filtered table in a csv file 
    #Change output directory if needed 
    write.csv(i_filtered2, file="/home/tama/Desktop/i_filtered.csv") 
}

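As I said, filter1 and filter2 are just regular expressions, and I use the chr column to filter the data. What is the correct way to assign the original name + "_filtered" to the new data frame?

Thanks in advance.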

Edit with additional information: every data frame has these variables (but the values can change)

chr  start end length 
chr1 10400 10669 270 
chr10 237646 237836 191 
chrX 713884 714414 531 
chrUn 713884 714414 531 
chr1 762664 763174 511 
chr4 805008 805571 564

I stored all of them in a list:

list_all <- list(heep, oe, st20_n, st20_t,all) 
list_all <- lapply(list_all, na.omit)

The filters:

#Get rid of random chromosomes 
filter1=".*random" 
#Get rid of undefined chromosomes 
filter2 = "^chrUn.*"

The output I'm looking for is:

heep_filtered1 
heep_filtered2 
oe_filtered1 
oe_filtered2 
etc

Source

2016-07-06 Tamara Dominguez Poncelas

Please add a minimal reproducible example. – Alex

[Use a list of data frames](http://stackoverflow.com/a/24376207/903061). – Gregor

@Alex, I've added more information. –

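One possibility is to iterate over the sequence of indices (or names) rather than over the list of data frames itself, and to access each data frame through that index.

Another problem is that the != operator does not support regular expressions. It only does exact literal matching. You need to use grepl() instead.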
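To make the difference concrete, here is a minimal sketch. The chromosome names are made up for illustration; only filter1 comes from the question.

```r
# Toy chromosome names (made up for illustration)
chr <- c("chr1", "chr1_random", "chrUn_gl000211", "chrX")
filter1 <- ".*random"

# != compares each string literally with the pattern text,
# so no element is ever "equal" to ".*random" and every row is kept
keep_literal <- chr != filter1

# grepl() treats filter1 as a regular expression;
# negating it keeps only the elements that do NOT match
keep_regex <- !grepl(filter1, chr)

chr[keep_literal]
chr[keep_regex]
```

With the literal comparison nothing is removed, while the grepl() version drops the element ending in "random".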

names(list_all) <- c("heep", "oe", "st20_n", "st20_t", "all") 

filtered <- list()  # will hold the filtered data frames, by name
for (i in names(list_all)){ 
    df <- list_all[[i]] 
    df.1 <- df[!grepl(filter1, df$chr), ] 
    df.2 <- df[!grepl(filter2, df$chr), ] 
    #Write the result filtered table in a csv file 
    #Change output directory if needed 
    write.csv(df.2, file=paste0("/home/tama/Desktop/", i, "_filtered.csv")) 
    filtered[[paste0(i, "_filtered", 1)]] <- df.1 
    filtered[[paste0(i, "_filtered", 2)]] <- df.2 
}

The result is a list named filtered that contains the filtered data frames.
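If separate variables such as heep_filtered1 are really needed, one option is to copy the list elements into an environment with list2env(). The sketch below uses a small stand-in for the filtered list (toy data, not the real tables), and result_env is a hypothetical name. Passing envir = .GlobalEnv instead would create the variables in the workspace.

```r
# Stand-in for the `filtered` list built by the loop (toy data)
filtered <- list(
  heep_filtered1 = data.frame(chr = c("chr1", "chrX"), start = c(10400, 713884)),
  heep_filtered2 = data.frame(chr = "chr1", start = 10400)
)

# Access one result directly by its name in the list
nrow(filtered[["heep_filtered1"]])

# Optionally copy every element into a variable of the same name;
# result_env is a hypothetical environment used to keep the workspace clean
result_env <- new.env()
list2env(filtered, envir = result_env)
ls(result_env)
```

Keeping everything in the list is usually easier to work with afterwards, since lapply() and friends can still be applied to all results at once.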

Source

2016-07-06 21:19:56

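Thanks for your reply, I added more information. –

@TamaraDominguezPoncelas I expanded the answer. –

Thank you, Ernest. I now have to modify the regular expressions because they are not doing what I want, but this code does iterate and create the new data frames and files. –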

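The problem is that i is only interpreted as the loop variable when it is used on its own. You are using it as part of another name, where it is just a literal character in the current version.

I would suggest naming the list and then using lapply instead of a for loop (note that I also changed the filtering to happen in one step, because right now it is not clear whether you are trying to remove both things; this also makes it easier to add more filters).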
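A small sketch of that point, using made-up data: i_filtered is one fixed variable name, whereas a name built with paste0() changes on each iteration. assign() is used here only to show the naming. A named list is usually the better container.

```r
# Toy list of data frames (made-up values)
list_all <- list(
  heep = data.frame(chr = c("chr1", "chr2_random")),
  oe   = data.frame(chr = c("chrX", "chrUn_1"))
)

for (i in names(list_all)) {
  # `i_filtered` is a single variable; the `i` in its name is never
  # substituted, so each iteration overwrites the previous value
  i_filtered <- list_all[[i]]

  # paste0() builds the name as a string, and assign() creates a
  # variable with that name in the current environment
  assign(paste0(i, "_filtered"), list_all[[i]])
}

exists("heep_filtered")
exists("oe_filtered")
```

After the loop, i_filtered holds only the last data frame, while heep_filtered and oe_filtered both exist.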

filters <- c(".*random", "chrUn.*") 
list_all <- list(heep = heep 
       , oe = oe 
       , st20_n = st20_n 
       , st20_t = st20_t 
       , all = all) 
toLoop <- names(list_all) 
names(toLoop) <- toLoop # renames them in the output list 


filtered <- lapply(toLoop, function(thisSet) { 
    # %in% only matches exact strings, so join the patterns with "|" and use grepl() 
    tempFiltered <- list_all[[thisSet]][!grepl(paste(filters, collapse = "|"), list_all[[thisSet]]$chr), ] 
    #Write the result filtered table in a csv file 
    #Change output directory if needed 
    write.csv(tempFiltered, file=paste0("/home/tama/Desktop/",thisSet,"_filtered.csv")) 

    # Return the part you care about 
    return(tempFiltered) 
})
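To check that an lapply-style approach writes its files, here is a self-contained sketch under some assumptions: the data are made up, the output goes to tempdir() rather than the Desktop path, and the patterns are joined with "|" so that a single grepl() call applies all of them. It uses sapply(..., simplify = FALSE), which returns a list named after the input names.

```r
# Toy data and a temporary output directory (both assumptions for illustration)
list_all <- list(
  heep = data.frame(chr = c("chr1", "chr1_random", "chrUn_1"), start = c(1, 2, 3)),
  oe   = data.frame(chr = c("chrX", "chrUn_2"), start = c(4, 5))
)
filters <- c(".*random", "^chrUn.*")
out_dir <- tempdir()

# Join the patterns with "|" so one grepl() call checks all of them
pattern <- paste(filters, collapse = "|")

# simplify = FALSE returns a list whose names are the input names
filtered <- sapply(names(list_all), function(this_set) {
  df <- list_all[[this_set]]
  kept <- df[!grepl(pattern, df$chr), ]
  write.csv(kept,
            file = file.path(out_dir, paste0(this_set, "_filtered.csv")),
            row.names = FALSE)
  kept
}, simplify = FALSE)

# TRUE for each file that was written
file.exists(file.path(out_dir, c("heep_filtered.csv", "oe_filtered.csv")))
```

If the files do not appear in a real run, it is worth checking that the output directory exists and that the call finished without a syntax error.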

Source

2016-07-06 21:39:21

Thanks Mark, I tried this, but for some reason it does not save the new csv files. –
