2014-03-25 92 views

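SeqIO: "No records found in handle"

I have just started using Python and Biopython and do not have much programming experience. I would appreciate any help you can give me.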

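I am trying to extract CDS and/or rRNA sequences from GenBank. It is important that I get only the open reading frames, which is why I am not simply pulling the entire sequence. When I run the code below, it kicks back an error saying "No records found in handle" for this line:

record = SeqIO.read(handle, "genbank")

I do not know how to resolve this. I have included the code I am using below. Also, if there is an easier way to do this, or published code that already does it, I would appreciate it if you let me know.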

Thanks!

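from Bio import Entrez, SeqIO

# searchdb, searchterm, searchfeaturetype and searchgeneproduct are assumed to be defined earlier (not shown)
# search sequences by a combination of keywords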
# need to find (number of) results to set 'retmax' value 
handle = Entrez.esearch(db = searchdb, term = searchterm) 
records = Entrez.read(handle) 
handle.close() 
# repeat search with appropriate 'retmax' value 
all_handle = Entrez.esearch(db = searchdb, term = searchterm, retmax = records['Count']) 
records = Entrez.read(all_handle) 

print " " 
print "Number of sequences found:", records['Count'] #printing to make sure that code is working thus far. 
print " " 

locations = [] # store locations of target sequences 
sequences = [] # store target sequences 

for i in range(0,int(records['Count'])) : 
    handle = Entrez.efetch(db = searchdb, id = records['IdList'][i], rettype = "gb", retmode = "xml") 
    record = SeqIO.read(handle, "genbank") 
    for feature in record.features:
        if feature.type == searchfeaturetype:  # searches features for proper feature type
            if searchgeneproduct in feature.qualifiers['product'][0]:  # searches features for proper gene product
                if str(feature.qualifiers) not in locations:  # no repeat location entries
                    locations.append(str(feature.location))  # appends location entry
                    sequences.append(feature.extract(record.seq))  # append sequence
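
A side note on the loop above, separate from the error being asked about: the duplicate check compares str(feature.qualifiers) against a list that stores str(feature.location), so it will most likely never match and repeats will not be filtered out. A minimal sketch of location-based deduplication follows. It uses plain (location, sequence) tuples as stand-ins for Biopython features, and the name collect_unique is hypothetical, not part of any library.

```python
def collect_unique(features):
    """Keep the first (location, sequence) pair seen for each location.

    `features` is an iterable of (location, sequence) pairs. In the real
    loop these would come from str(feature.location) and
    feature.extract(record.seq); plain strings are used here as stand-ins.
    """
    seen = set()          # location strings already recorded
    locations = []        # unique locations, in first-seen order
    sequences = []        # sequence matching each unique location
    for location, sequence in features:
        key = str(location)
        if key in seen:   # same location already stored, skip the repeat
            continue
        seen.add(key)
        locations.append(key)
        sequences.append(sequence)
    return locations, sequences


# Made-up example data for illustration only.
example = [
    ("[0:9](+)", "ATGAAATAA"),
    ("[0:9](+)", "ATGAAATAA"),    # repeat of the first location
    ("[20:29](-)", "ATGCCCTAG"),
]
locations, sequences = collect_unique(example)
print(locations)
print(sequences)
```

Comparing like with like (location against stored locations) is the point here; the set is only a convenience so the membership test stays fast as the list grows.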

Thanks GWW. This solved my problem! – jrp355

Answer


You are requesting XML from GenBank, but SeqIO.read with the "genbank" format expects the GenBank flat file format. Try changing your efetch line to:

handle = Entrez.efetch(db = searchdb, id = records['IdList'][i], rettype = "gb", retmode = "text")
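
To see why the parser finds no records, it can help to peek at what efetch actually returned before handing it to SeqIO. The sketch below assumes that a GenBank flat file record begins with a LOCUS line while the XML response begins with an XML declaration. It uses only the standard library; the helper name looks_like_genbank_flatfile and both sample payloads are hypothetical and heavily abbreviated.

```python
def looks_like_genbank_flatfile(payload: str) -> bool:
    """Return True if the first non-blank line starts with 'LOCUS'.

    This is only a rough sanity check, not a validator: it looks at the
    first non-blank line and nothing else.
    """
    for line in payload.splitlines():
        if line.strip():
            return line.startswith("LOCUS")
    return False  # empty payload: nothing that could be a record


# Hypothetical, abbreviated payloads for illustration only.
flatfile_sample = (
    "LOCUS       EXAMPLE01    60 bp    DNA     linear   BCT 01-JAN-2000\n"
    "//\n"
)
xml_sample = '<?xml version="1.0"?>\n<GBSet>\n</GBSet>\n'

print(looks_like_genbank_flatfile(flatfile_sample))
print(looks_like_genbank_flatfile(xml_sample))
```

In the original loop, the text to check would be whatever handle.read() returns. Since reading consumes the handle, the content would then need to be wrapped again (for example in io.StringIO) before being passed to SeqIO.read.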