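Saving loop output to a matrix

I am trying to access the NCBI SRA database, query it for a list of IDs, and save the output to a matrix.
I am using the SRAdb package from Bioconductor to do this. I can now connect to and query the database, but it is really slow, and I don't know how to save the output of the loop.
The file GPL11154_GSMs.txt contains the IDs I am interested in, and it looks like this: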
GSM616127
GSM616128
GSM616129
GSM663427
GSM665037
Right now the result is updated (overwritten) on every iteration:
#source("https://bioconductor.org/biocLite.R")
#biocLite("SRAdb")
library(SRAdb)
#connect to database
sqlfile <- getSRAdbFile()
sra_con <- dbConnect(SQLite(),sqlfile)
## lists all the tables in the SQLite database
sra_tables <- dbListTables(sra_con)
sra_tables
dbGetQuery(sra_con,'PRAGMA TABLE_INFO(study)')
## checking the structure of the tables
#dbListFields(sra_con,"experiment")
#dbListFields(sra_con,"run")
#read in file with sample IDs per platform
x <- scan("GPL11154_GSMs.txt", what="", sep="\n")
gsm_list <- strsplit(x, "[[:space:]]+") # Separate elements by one or more whitespace characters
for (gsm in gsm_list){
  gsm_to_srr <- getSRA(search_terms = gsm, out_types = c("submission", "study", "sample","experiment", "run"), sra_con)
  print(gsm_to_srr)
}
Yes, I converted it with res.df = as.data.frame(do.call(rbind,res)) and then saved it. It works perfectly. Thanks. – MenieM
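
For readers following along, the conversion mentioned in the comment above could look roughly like the sketch below. It does not call SRAdb at all: `lookup_one` is a hypothetical stand-in for the `getSRA()` call in the question, and the sketch assumes each lookup returns a data frame with the same columns. The column names `id` and `run`, the placeholder run values, and the output file name are made up for illustration.

```r
# Hypothetical stand-in for the getSRA() call in the question.
# Assumption: each lookup returns a data frame with identical columns.
lookup_one <- function(gsm) {
  data.frame(id = gsm,
             run = paste0("run_for_", gsm),  # placeholder value, not real SRA data
             stringsAsFactors = FALSE)
}

gsm_ids <- c("GSM616127", "GSM616128", "GSM616129")

# Keep one result per ID in a list instead of overwriting a single variable.
res <- lapply(gsm_ids, lookup_one)

# Stack the per-ID data frames into one table, as in the comment above.
res.df <- as.data.frame(do.call(rbind, res))

# Save the combined table; the file name is only an example.
out_file <- file.path(tempdir(), "gsm_to_srr.csv")
write.csv(res.df, out_file, row.names = FALSE)
```

In the real script, the body of `lookup_one` would presumably be replaced by the `getSRA(search_terms = gsm, ..., sra_con)` call from the question. If the lookups can return differing columns, `rbind` will fail, and the results would need to be aligned first.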