2014-09-19 65 views

I'm having some trouble with this Python code. Generating all DNA kmers with Python

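I'm new to Python, and I want to generate all possible DNA kmers of length k and add them to a list, but I can't think of an elegant way to do it! Below is what I came up with for a kmer of length 8. Any suggestions would be very helpful.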

bases = ['A', 'T', 'G', 'C']
kmer = []

for i in bases:
    for j in bases:
        for k in bases:
            for l in bases:
                for m in bases:
                    for n in bases:
                        for o in bases:
                            for p in bases:
                                kmer.append(i + j + k + l + m + n + o + p)

'list(itertools.product(bases, repeat=k))' – inspectorG4dget 2014-09-19 21:24:37

Answer

In [57]: import itertools

In [58]: bases=['A','T','G','C']

In [59]: k = 2 

In [60]: [''.join(p) for p in itertools.product(bases, repeat=k)] 
Out[60]: ['AA', 'AT', 'AG', 'AC', 'TA', 'TT', 'TG', 'TC', 'GA', 'GT', 'GG', 'GC', 'CA', 'CT', 'CG', 'CC'] 

In [61]: k = 3 

In [62]: [''.join(p) for p in itertools.product(bases, repeat=k)] 
Out[62]: ['AAA', 'AAT', 'AAG', 'AAC', 'ATA', 'ATT', 'ATG', 'ATC', 'AGA', 'AGT', 'AGG', 'AGC', 'ACA', 'ACT', 'ACG', 'ACC', 'TAA', 'TAT', 'TAG', 'TAC', 'TTA', 'TTT', 'TTG', 'TTC', 'TGA', 'TGT', 'TGG', 'TGC', 'TCA', 'TCT', 'TCG', 'TCC', 'GAA', 'GAT', 'GAG', 'GAC', 'GTA', 'GTT', 'GTG', 'GTC', 'GGA', 'GGT', 'GGG', 'GGC', 'GCA', 'GCT', 'GCG', 'GCC', 'CAA', 'CAT', 'CAG', 'CAC', 'CTA', 'CTT', 'CTG', 'CTC', 'CGA', 'CGT', 'CGG', 'CGC', 'CCA', 'CCT', 'CCG', 'CCC']
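
For use outside an interactive session, the same idea can be wrapped in a small function. This is only a sketch: the names all_kmers and iter_kmers are made up for illustration and do not come from the answer above. The generator variant may be preferable for larger k, since there are 4**k kmers and the full list grows quickly.

```python
from itertools import product


def all_kmers(k, bases="ATGC"):
    """Return every string of length k over `bases` as a list (hypothetical helper)."""
    # product(bases, repeat=k) yields tuples such as ('A', 'T'); join each into a string.
    return ["".join(p) for p in product(bases, repeat=k)]


def iter_kmers(k, bases="ATGC"):
    """Yield the same kmers one at a time, without building the whole list."""
    for p in product(bases, repeat=k):
        yield "".join(p)


if __name__ == "__main__":
    kmers = all_kmers(8)
    print(len(kmers))   # 4**8 = 65536
    print(kmers[:3])    # ['AAAAAAAA', 'AAAAAAAT', 'AAAAAAAG']
```

The order matches the nested loops in the question: the last position cycles fastest, so the list starts with all-A and ends with all-C.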