2017-01-25 39 views
2
from Bio import SeqIO 
import re, os 
import pandas as pd 
from Bio.Seq import Seq 
from Bio.Alphabet import generic_dna 
from Bio.SeqRecord import SeqRecord 
os.chdir(r'c:\Users\Workspace\Desktop')


filename = os.path.join(os.getcwd(),'convertedgisaid','df.dat') 
df = pd.read_table(filename, header=None, sep=' ',low_memory=False) 
df.columns = ['GID','IsolateID','Carrier','Country','HN','Type','Date','Segment','Gene','Length','ETC','SEQ'] 

f_in = os.path.join(os.getcwd(),'convertedgisaid','annotationFULL.tbl') 
f_out = os.path.join(os.getcwd(),'convertedgisaid','gisaid_influenza.cds') 
file = open(f_in,'r') 
records = file.read().split('>Feature ') 
file.close() 
records = records[1:] 
f = open(f_out,'w') 
start=1 
end=0 
for rec in records:
    withoutNewline = re.sub("\n"," ",rec)
    GID = re.match('\d{1,6}',withoutNewline).group()
    Details = df[df.GID==GID]
    Seq = list(Details.SEQ)[0]
    codingSeq=''
    codingDetails = ''
    cdsSegment = re.findall("((?:\d{1,4} |<\d{1,4} >|\d{1,4} >)\d{1,4} CDS)",withoutNewline)
    for cds in cdsSegment:
        cdsSplit = cds.split(' ')
        if(cdsSplit[0][0]=="<" or cdsSplit[1][0]==">"):
            if(cdsSplit[0][0]=="<"):
                start = cdsSplit[0][1:]
            else:
                start = cdsSplit[0]
            if(cdsSplit[1][0]==">"):
                end = cdsSplit[1][1:]
            else:
                end = cdsSplit[1]
        else:
            start = cdsSplit[0]
            end = cdsSplit[1]
        codingDetails+=cdsSplit[0]+'-'+cdsSplit[1]+','
        codingSeq+=Seq[(int(start)-1):int(end)]
    codingDetails = codingDetails[:-1]
    curSeq = codingSeq.upper()
    curId = GID
    curDesc = ":"+codingDetails+"Influenza "+list(Details.Type)[0]+" virus ("+list(Details.ETC)[0]+" (" +list(Details.HN)[0]+"))"
    cdsRecord = SeqRecord(Seq(curSeq, generic_dna), id=curId, description=curDesc)
    SeqIO.write(cdsRecord,f,"fasta")
f.close() 

The code above produces the following error:

Traceback (most recent call last):
  File "", line 1, in
TypeError: 'str' object is not callable

Topic: Biopython unable to declare new SeqRecord

May I know what is wrong? I am using Biopython.


Does 'from Bio import SeqIO' work instead of 'from Bio import SeqIO \n'? –


That was my mistake, sorry haha! – TJA

Answers

3

When asking for help on Stack Overflow, always try to reduce the problem to a Minimal, Complete, and Verifiable example.

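If you do that, you will see that you have the following import:

from Bio.Seq import Seq

but later in your program you define a variable named Seq as follows:

Seq = list(Details.SEQ)[0]

From that point on, Seq no longer refers to the Seq class from Bio.Seq but to a string.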

So when you try to execute SeqRecord(Seq(curSeq, generic_dna), id=curId, description=curDesc), you get a TypeError, because Seq is no longer callable.
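The failure can be reproduced without Biopython or the input files. The snippet below is only a sketch: the `Seq` class in it is a hypothetical stand-in for `Bio.Seq.Seq`, and the sequence string is made up for illustration.

```python
# Hypothetical stand-in for Bio.Seq.Seq, so the example runs without Biopython.
class Seq:
    def __init__(self, data):
        self.data = data


# Same pattern as in the question: the name Seq is rebound to a plain string.
Seq = "atggcc"  # made-up sequence; this shadows the class defined above

try:
    Seq("ATGGCC")  # this now calls a str, not the class
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)  # 'str' object is not callable
```

The message matches the last line of the traceback in the question.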

The solution is to rename your Seq variable to something else, so that it no longer shadows the imported Bio.Seq.Seq.
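A minimal sketch of the rename, again using a hypothetical stand-in class instead of `Bio.Seq.Seq` so that it runs on its own. The name `full_seq`, the sequence, and the coordinates are assumptions chosen for illustration, not values from the question's data.

```python
# Hypothetical stand-in for Bio.Seq.Seq, so the example runs without Biopython.
class Seq:
    def __init__(self, data):
        self.data = data


# Renamed variable (full_seq is an illustrative name): Seq stays bound to the class.
full_seq = "atggccaaatga"  # made-up sequence in place of list(Details.SEQ)[0]

start, end = 1, 6  # made-up 1-based, inclusive CDS coordinates
coding_seq = full_seq[start - 1:end].upper()

record_seq = Seq(coding_seq)  # works: Seq is still callable
print(record_seq.data)  # ATGGCC
```

In the question's script, this means changing the two lines that use the variable: the assignment from Details.SEQ and the slice inside the CDS loop.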


Thanks! I managed to solve it with your solution :) – TJA


You're welcome. If it helped you, please don't forget to [accept the answer](http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work). – BioGeek