Convert a dataset of strings into a matrix
I have a tab-delimited dataset, and I want to convert the following dataset into a matrix:
CATGGGGAAAACTGA
CCTCTCGATCACCGA
CCTATAGATCACCGA
CCGATTGATCACCGA
CCTTGTGCAGACCGA
I used:
rbind(strsplit("CATGGGGAAAACTGA","")[[1]],
strsplit("CCTCTCGATCACCGA","")[[1]],
strsplit("CCTCTCGATCACCGA","")[[1]],
strsplit("CCTATAGATCACCGA","")[[1]],
strsplit("CCGATTGATCACCGA","")[[1]],
strsplit("CCTTGTGCAGACCGA","")[[1]])
and this produces:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
[1,] "C" "A" "T" "G" "G" "G" "G" "A" "A" "A" "A" "C" "T" "G" "A"
[2,] "C" "C" "T" "C" "T" "C" "G" "A" "T" "C" "A" "C" "C" "G" "A"
[3,] "C" "C" "T" "C" "T" "C" "G" "A" "T" "C" "A" "C" "C" "G" "A"
[4,] "C" "C" "T" "A" "T" "A" "G" "A" "T" "C" "A" "C" "C" "G" "A"
[5,] "C" "C" "G" "A" "T" "T" "G" "A" "T" "C" "A" "C" "C" "G" "A"
[6,] "C" "C" "T" "T" "G" "T" "G" "C" "A" "G" "A" "C" "C" "G" "A"
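However, this approach becomes tedious when the dataset is very large. How can I do this automatically?
Use 'do.call': something like 'do.call("rbind", lapply(myDNAVec, strsplit, split = ""))'. – lmo
Is the sequence length fixed, always 15? – zx8754
@lmo There is no need for 'lapply'. 'strsplit(myDNAvec, split = '')' will work. –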
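Putting the comments above together, here is a minimal sketch of how the automated version might look. The vector name `myDNAvec` comes from the comments; its contents here are just the five sequences from the question, typed in by hand as an assumption about how the data would be loaded. The sketch also assumes every sequence has the same length, which is what zx8754 asks about.

```r
# Assumed input: the five sequences from the question as a character vector.
# In practice this vector would be read from the data file instead.
myDNAvec <- c("CATGGGGAAAACTGA",
              "CCTCTCGATCACCGA",
              "CCTATAGATCACCGA",
              "CCGATTGATCACCGA",
              "CCTTGTGCAGACCGA")

# Guard for the equal-length assumption: rbind() would otherwise
# recycle the shorter vectors to fill the rows.
stopifnot(length(unique(nchar(myDNAvec))) == 1)

# strsplit() is vectorised, so it returns a list holding one
# character vector (one letter per element) for each input string.
chars <- strsplit(myDNAvec, split = "")

# Stack those vectors as rows of a character matrix.
dnaMatrix <- do.call(rbind, chars)

dim(dnaMatrix)
```

Because 'strsplit' already handles the whole vector at once, wrapping it in 'lapply' is unnecessary, and 'do.call(rbind, ...)' scales to any number of sequences without typing each one out.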