2016-03-28
0

Filter rows by space-separated row names

I want to filter out the rows in which the fourth part of the row name ($V4) matches "u-".

> head(miraligner_filterCRC_filterMM) 
                 100G 100R 106G 106R 122G 122R 124G 124R 126G 126R 134G 134R 
hsa-miR-1296-5p TTAGGGCCCTGGCTCCATCT 0 0 0 u-CC   23 17 11 21 29 14 16 20 11 1 37 13 
hsa-miR-887-3p GTGAACGGGCGCCATCCCGAGGCTT 0 0 0 d-CTT  3 8 0 4 4 3 2 12 12 3 4 8 
hsa-miR-454-3p TAGTGCAATATTGCTTATAGGGTAT 0 u-AT 0 0  2 1 1 0 0 0 0 1 2 1 8 2 
hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTG 0 u-TG 0 d-C 2 1 0 0 0 0 0 1 0 2 2 2 
hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTA 0 u-TA 0 d-C 6 6 5 0 3 3 1 4 1 1 8 5 
hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTT 0 u-TT 0 d-C 22 41 12 26 25 51 2 25 2 24 91 51 

I tried:

table(read.table(text=rownames(miraligner_filterCRC_filterMM))$V4=="u-")

Answers

2

Starting data:

x <- c("hsa-miR-1296-5p TTAGGGCCCTGGCTCCATCT 0 0 0 u-CC  ", 
     "hsa-miR-887-3p GTGAACGGGCGCCATCCCGAGGCTT 0 0 0 d-CTT ", 
     "hsa-miR-454-3p TAGTGCAATATTGCTTATAGGGTAT 0 u-AT 0 0 ", 
     "hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTG 0 u-TG 0 d-C", 
     "hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTA 0 u-TA 0 d-C", 
     "hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTT 0 u-TT 0 d-C") 

Match "u-" after 3 groups of non-space characters, each followed by a space:

grep("^(?:[^ ]+ ){3}u-",x,value=TRUE) 

# [1] "hsa-miR-454-3p TAGTGCAATATTGCTTATAGGGTAT 0 u-AT 0 0 " 
# [2] "hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTG 0 u-TG 0 d-C" 
# [3] "hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTA 0 u-TA 0 d-C" 
# [4] "hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTT 0 u-TT 0 d-C" 
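
The same pattern can also be used to subset the data frame itself, by testing its row names with grepl. The following is only a minimal sketch: `df` is a hypothetical stand-in for miraligner_filterCRC_filterMM that reproduces just the first four rows and the first two count columns from the question, and `perl = TRUE` is added here as a precaution so that the non-capturing group is handled by PCRE.

```r
# Stand-in for miraligner_filterCRC_filterMM (hypothetical: 4 rows, 2 count columns)
rn <- c("hsa-miR-1296-5p TTAGGGCCCTGGCTCCATCT 0 0 0 u-CC",
        "hsa-miR-887-3p GTGAACGGGCGCCATCCCGAGGCTT 0 0 0 d-CTT",
        "hsa-miR-454-3p TAGTGCAATATTGCTTATAGGGTAT 0 u-AT 0 0",
        "hsa-miR-200b-3p TAATACTGCCTGGTAATGATGACTG 0 u-TG 0 d-C")
df <- data.frame(`100G` = c(23, 3, 2, 2), `100R` = c(17, 8, 1, 1),
                 row.names = rn, check.names = FALSE)

# TRUE where the 4th space-separated field of the row name starts with "u-"
has_u <- grepl("^(?:[^ ]+ ){3}u-", rownames(df), perl = TRUE)

kept    <- df[has_u, , drop = FALSE]   # rows whose 4th field starts with "u-"
dropped <- df[!has_u, , drop = FALSE]  # all remaining rows
```

Use `kept` or `dropped` depending on whether "filter out" means keeping or removing the "u-" rows.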
+0

What should I do if I want to filter for an exact match of "u-A"? – user2300940

+0

@user2300940 Do you see the 'u-' at the end of the regular expression? Change it to "u-A". – Gregor

+0

That will also match u-AA – user2300940
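
One possible way to handle this, offered as a sketch rather than as part of the original answer, is to require that the field ends right after "u-A", that is, that it is followed by a space or by the end of the string. The row names below are hypothetical and exist only to illustrate the difference. The field-based variant assumes the fields are separated by exactly one space.

```r
# Hypothetical row names: the 4th field is "u-A", "u-AA", "u-AT" and "0"
x <- c("hsa-miR-1 ACGT 0 u-A 0 0",
       "hsa-miR-2 ACGT 0 u-AA 0 0",
       "hsa-miR-3 ACGT 0 u-AT 0 d-C",
       "hsa-miR-4 ACGT 0 0 0 u-A")

# "(?: |$)" forces the 4th field to end right after "u-A"
exact <- grep("^(?:[^ ]+ ){3}u-A(?: |$)", x, value = TRUE, perl = TRUE)

# Equivalent field-based check: split on single spaces and compare the 4th field
field4 <- vapply(strsplit(x, " ", fixed = TRUE), `[`, character(1), 4)
exact2 <- x[field4 == "u-A"]
```

Both `exact` and `exact2` select only the first element; "u-AA" and "u-AT" are excluded, and so is the last element, where "u-A" appears in the sixth field rather than the fourth.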

1
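# V4: 4th space-separated field of the row names, parsed as in the question
V4 <- read.table(text = rownames(miraligner_filterCRC_filterMM), stringsAsFactors = FALSE)$V4
columnSplitted <- strsplit(as.character(V4), "-")
part1 <- sapply(columnSplitted, `[`, 1)  # part before the first "-"; "0" stays "0"
miraligner_filterCRC_filterMM[part1 != "u", ]  # drop rows whose V4 starts with "u-"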