Counting nucleotides in Python with a for loop

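I am trying to extract DNA sequences from an input file and use a loop to count the number of A's, T's, C's, and G's in each one. If a sequence contains any letter other than "ATCG", I need to print "error". For example, my input file is: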

SEQ1 AAAGCGT SEQ2 AA tGcGt牛逼 SEQ3 AF GTGA CCTG

The code I came up with is:

acount = 0
ccount = 0
gcount = 0
tcount = 0
for line in input:
    line = line.strip('\n')
    if line[0] == ">":
        print line + "\n"
        output.write(line + "\n")
    else:
        line = line.upper()
        list = line.split()
        for list in line:

            if list == "A":
                acount = acount + 1
                #print acount
            elif list == "C":
                ccount = ccount + 1
                #print ccount

            elif list == "T":
                tcount = tcount + 1
                #print tcount
            elif list == "G":
                gcount = gcount + 1
                #print gcount
            elif list != 'A'or 'T' or 'G' or 'C':
                break

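I need the totals for each line, but my code gives me the totals of A's, T's, and so on for the whole file. I want my output to look like this:

SEQ1: Total A's: 3 Total C's: and so on for each sequence.

Any ideas on what I can do to fix my code to achieve that?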

Source

2013-04-01 user2097877

Reset `acount` at the start of each for loop iteration. – Blender
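A minimal sketch of what the comment above suggests: the counters are created inside the outer loop, so each sequence line starts from zero. The function name `count_per_sequence`, the dictionary of counters, and the choice to ignore spaces inside a sequence are assumptions made for illustration, not part of the original code. A line containing any other character gets `None` in place of its counts, so the caller can print "error" for it.

```python
def count_per_sequence(lines):
    """Return a list of (header, counts) pairs, one per sequence line.

    counts is a dict keyed by "A", "C", "G", "T", or None when the line
    contains any other character. Spaces are skipped, which is an
    assumption about the input format.
    """
    results = []
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line[0] == ">":
            # Header line: remember it for the sequence that follows.
            header = line
            continue
        # Reset the counters for every sequence line.
        counts = {"A": 0, "C": 0, "G": 0, "T": 0}
        for base in line.upper():
            if base == " ":
                continue
            if base in counts:
                counts[base] += 1
            else:
                # Not A, C, G or T: mark the whole line as invalid.
                counts = None
                break
        results.append((header, counts))
    return results
```

Because the function accepts any iterable of lines, it can be given an open file object directly, or a plain list of strings when testing.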

I would suggest something along these lines:

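import re

def countNucleotides(filePath):
    # One list per base; each entry holds the count for one line of the file.
    aCount = []
    gCount = []
    cCount = []
    tCount = []
    with open(filePath, 'r') as data:
        for line in data:
            # Drop the trailing newline and ignore case before matching.
            line = line.strip().lower()
            # Stop at the first line that is not made up solely of a, g, c, t.
            if not re.match(r'[agct]+$', line):
                break
            aCount.append(notCount(line, 'a'))
            gCount.append(notCount(line, 'g'))
            cCount.append(notCount(line, 'c'))
            tCount.append(notCount(line, 't'))
    return aCount, gCount, cCount, tCount

def notCount(line, character):
    # Count occurrences by hand rather than calling str.count.
    appearances = 0
    for item in line:
        if item == character:
            appearances += 1
    return appearances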

You can then print the results out however you like.
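As one possible way to print them, the sketch below formats per-line counts in the style the question asks for. It assumes the four lists of per-line counts are already available to the caller. The helper name `format_report`, the `SEQ` numbering by position, and the "Total G's" and "Total T's" labels are illustrative assumptions extrapolated from the sample output in the question.

```python
def format_report(aCount, cCount, gCount, tCount):
    """Build one report line per sequence from four parallel count lists."""
    rows = []
    # zip pairs up the counts that belong to the same line of the file.
    for number, (a, c, g, t) in enumerate(
            zip(aCount, cCount, gCount, tCount), start=1):
        rows.append(
            "SEQ{0}: Total A's: {1} Total C's: {2} "
            "Total G's: {3} Total T's: {4}".format(number, a, c, g, t)
        )
    return rows

# Example with hand-written counts for two sequences.
for row in format_report([3, 2], [1, 1], [2, 2], [1, 2]):
    print(row)
```

Returning the rows instead of printing inside the helper keeps it easy to write the same text to an output file as well.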

Source

2013-04-01 04:21:55

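I like what you have here, @Slater Tyranus. The only problem is (in case you couldn't tell) that this is a school assignment, and if I use the .count function I will get points docked. – user2097877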

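If the question is homework, please use the homework tag. Stack Overflow isn't really for homework, but I'll update this so it doesn't use the count function. –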
