2017-05-24 33 views
0
myfunction3 <- function(seq2,z)
for(j in 1:100)
{
  if(z[j]>0.7)
  {
    if(seq2[j] =='A') replace(seq2,j,sample(c("C","G","T"),1))
    else if(seq2[j] =='G') replace(seq2,j,sample(c("C","A","T"),1))
    else if(seq2[j] =='T') replace(seq2,j,sample(c("C","G","A"),1))
    else if(seq2[j] =='C') replace(seq2,j,sample(c("A","G","T"),1))
    else if(seq2[j]=='E') replace(seq2,j,'T')
  }
}
return(seq2)

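I wrote this function in R to simulate mutations of a given DNA sequence seq2 (A, G, T, C) according to a probability vector z: wherever the probability is greater than 0.7, the new sequence can take any of the other three nucleotides. But every time I call it, it returns a NULL vector.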

+1

You need to define your function with braces around the whole expression: `function(seq2, z){ ... return(seq2)}`

+0

If seq2 is a single string, then seq2[j] is NA.

+0

Also, I'm not sure `replace` is the right way to do this. Use `seq2[j] <- sample(c(...),1)` for each statement instead.
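
Taken together, the comments above suggest a fix along the following lines. This is only a minimal sketch: it assumes `seq2` is a character vector with one nucleotide per element and `z` is a numeric vector of the same length. The name `mutate_seq`, the use of `seq_along` instead of `1:100`, and the `strsplit` step for single-string input are illustrative choices, not part of the original question.

```r
# Hypothetical corrected version of the asker's function (name is illustrative).
mutate_seq <- function(seq2, z) {
  bases <- c("A", "C", "G", "T")
  # The braces make the loop AND the final value part of the function body.
  for (j in seq_along(seq2)) {
    if (z[j] > 0.7) {
      if (seq2[j] %in% bases) {
        # Assign the result back; calling replace() alone does not modify seq2.
        seq2[j] <- sample(setdiff(bases, seq2[j]), 1)
      } else if (seq2[j] == "E") {
        seq2[j] <- "T"
      }
    }
  }
  seq2
}

# A single string has to be split into one-character elements first.
dna <- strsplit("ACGTE", "")[[1]]
result <- mutate_seq(dna, c(0.9, 0.1, 0.9, 0.1, 0.9))
print(result)
```

Positions 2 and 4 keep their original letters because their `z` values are below the threshold; position 5 (`E`) always becomes `T`.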

Answers

1

Here is a compact variant of your function:

myfunction3 <- function(seq2,z) { 
    for(j in which(z>0.7)) 
    seq2[j] <- switch(seq2[j], 
         A=sample(c("C","G","T"),1), 
         G=sample(c("C","A","T"),1), 
         T=sample(c("C","G","A"),1), 
         C=sample(c("A","G","T"),1), 
         E="T" 
    ) 
    return(seq2) 
} 

Here is how it works:

set.seed(42) 
z <- sample(1:10)/10 
seq <- sample(c("A","G","T", "C"), 10, repl=TRUE) 
data.frame(seq, z, seq2=myfunction3(seq,z)) 
#    seq   z seq2
# 1    G 1.0    T
# 2    T 0.9    C
# 3    C 0.3    C
# 4    G 0.6    G
# 5    G 0.4    G
# 6    C 0.8    T
# 7    C 0.5    C
# 8    A 0.1    A
# 9    G 0.2    G
# 10   T 0.7    T

Testing the last condition (E = "T"):

set.seed(42) 
z <- sample(3:17)/10 
seq <- sample(c("A","G","T", "C", "E"), length(z), repl=TRUE) 
data.frame(seq, z, seq2=myfunction3(seq,z)) 
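
The output of this second run is not shown. A deterministic way to check the `E` branch, sketched here with made-up inputs (`seq_e` and `z_e` are illustrative names), is to pass a sequence consisting only of `E`, so that no random sampling is involved. `myfunction3` is repeated so the snippet runs on its own.

```r
# Copy of myfunction3 from the answer, so the snippet is self-contained.
myfunction3 <- function(seq2, z) {
  for (j in which(z > 0.7))
    seq2[j] <- switch(seq2[j],
                      A = sample(c("C", "G", "T"), 1),
                      G = sample(c("C", "A", "T"), 1),
                      T = sample(c("C", "G", "A"), 1),
                      C = sample(c("A", "G", "T"), 1),
                      E = "T")
  return(seq2)
}

# Illustrative input: every element is "E", so the result is not random.
seq_e <- rep("E", 4)
z_e   <- c(0.8, 0.7, 0.2, 1.0)
out_e <- myfunction3(seq_e, z_e)

# Only positions 1 and 4 have z > 0.7 (0.7 itself does not qualify).
print(out_e)
# [1] "T" "E" "E" "T"
```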
1

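I am assuming that seq2 is a character vector, that z is a vector of the same length, and that the positions to mutate are those in seq2 where z > 0.7.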

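One way to do this is to first create a list of valid substitutions, keyed by nucleotide, then write a mutate function, and finally sapply that function over the subvector of seq2 where z > 0.7.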

substitutions <- list(A = c("C","G","T"), 
        G = c("A","C","T"), 
        T = c("A","C","G"), 
        C = c("A","G","T"), 
        E = c("T")) 

mutate <- function(nucleotide){ 
    sample(substitutions[[nucleotide]],1) 
} 

myfunc <- function(seq2,z){ 
    to.change <- which(z > 0.7) 
    seq2[to.change] <- sapply(seq2[to.change],mutate) 
    seq2 
} 

For example:

> s <- sample(c("A","T","G","C","E"),10, replace = T) 
> z <- sample(c(0,0.8),10, replace = T) 
> rbind(s,z,myfunc(s,z)) 
  [,1]  [,2] [,3] [,4]  [,5] [,6]  [,7]  [,8]  [,9] [,10]
s "E"   "A"  "C"  "G"   "E"  "C"   "E"   "T"   "E"  "A"
z "0.8" "0"  "0"  "0.8" "0"  "0.8" "0.8" "0.8" "0"  "0.8"
  "T"   "A"  "C"  "C"   "E"  "A"   "T"   "G"   "E"  "T"
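
One edge case may be worth handling: when no element of z exceeds 0.7, `sapply` is called on a zero-length input and returns an empty list rather than a character vector, which may change the type of `seq2` on assignment. A possible variant, sketched here with `vapply` (the name `myfunc_safe` is hypothetical), always gets a character vector back from the mapping step.

```r
# Substitution table copied from the answer above.
substitutions <- list(A = c("C", "G", "T"),
                      G = c("A", "C", "T"),
                      T = c("A", "C", "G"),
                      C = c("A", "G", "T"),
                      E = c("T"))

# Hypothetical variant of myfunc: vapply() returns character(0) for empty
# input, so seq2 stays a character vector even when nothing is selected.
myfunc_safe <- function(seq2, z) {
  to.change <- which(z > 0.7)
  seq2[to.change] <- vapply(seq2[to.change],
                            function(n) sample(substitutions[[n]], 1),
                            character(1), USE.NAMES = FALSE)
  seq2
}

s <- c("A", "T", "G", "C", "E")
unchanged <- myfunc_safe(s, rep(0, 5))    # no z > 0.7: returned as is
mutated   <- myfunc_safe(s, rep(0.8, 5))  # every position is replaced
```

Since each substitution set excludes its own key, every selected position ends up different from the original.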