2016-09-28 117 views
-1

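Probability of finding a word in a string

If I have a longer string, how can I calculate the probability of finding a word of a given length somewhere in that string?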

So far I have this:

import math 
from scipy import stats 

alphabet = list("ATCG") # This is the alphabet I am working with 
string = "AATCAGTAGATCG" # Here are two example strings 
string2 = "TGTAAACCTTGGTTTATCG" 
word = "ATCG" # This is my word 

n_substrings = len(string) - len(word) # The number of possible substrings 
n_substrings2 = len(string2) - len(word) 

prob_match = math.pow(len(alphabet), - len(word)) # The probability of randomly choosing the word from the alphabet 

# Get the probability from a binomial test? 
print stats.binom_test(1, n_substrings, p=prob_match) # (Number of successes, number of trials, prob of success) 
print stats.binom_test(1, n_substrings2, p=prob_match) 

>>>0.0346119111615 
    0.0570183821615 

Is this an appropriate way to do it, or am I missing something?

+0

Why the downvote? – kezzos

Answer

1

I think you should use:

n_substrings = len(string) - len(word) +1 

In a 5-character string, a 4-letter word has two possible positions: ATCGA can contain ATCG and TCGA.
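To make the counting concrete, here is a minimal sketch that lists every window of the word's length and then uses the corrected count. The helper names (`windows`, `count_windows`, `approx_prob_at_least_one`) are made up for illustration. The probability step assumes uniformly random letters and treats windows as independent, which is only an approximation because neighbouring windows overlap.

```python
def windows(text, size):
    """Return every contiguous substring of the given size."""
    return [text[i:i + size] for i in range(len(text) - size + 1)]


def count_windows(text, word):
    """Number of start positions where the word could sit: len(text) - len(word) + 1."""
    return max(len(text) - len(word) + 1, 0)


def approx_prob_at_least_one(text, word, alphabet_size):
    """Approximate P(word occurs at least once) in a random string of len(text).

    Assumption: each window matches independently with probability
    alphabet_size ** -len(word). Overlapping windows are not truly
    independent, so this is a rough estimate only.
    """
    p = float(alphabet_size) ** -len(word)
    n = count_windows(text, word)
    return 1 - (1 - p) ** n


print(windows("ATCGA", 4))                     # ['ATCG', 'TCGA']
print(count_windows("AATCAGTAGATCG", "ATCG"))  # 10
print(approx_prob_at_least_one("AATCAGTAGATCG", "ATCG", 4))
```

With the `+ 1`, the two example strings from the question have 10 and 16 candidate positions rather than 9 and 15.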

+0

Yes, thank you, that was a basic mistake. – kezzos