How can I get the sequence description from a GI number with Biopython?

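I have a list of GI (GenBank identifier) numbers. How can I get the sequence description (for example 'mus musculus hypothetical protein X') for each GI number, so that I can store it in a variable and write it to a file? Thanks for your help!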

Source

2016-02-25 sequence_hard

Check this out to get you started: http://biopython.org/DIST/docs/api/Bio.Entrez-pysrc.html – heathobrien

So, in case anyone else has this question, here is the solution:

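from Bio import Entrez
Entrez.email = "your.email@example.com"  # placeholder; NCBI asks you to identify yourself
# db is one Entrez database ("nucleotide", "protein", ...); id is a GI or NCBI reference number
handle = Entrez.esummary(db="nucleotide", id="gi or NCBI_ref number")
record = Entrez.read(handle)
handle.close()
description = record[0]["Title"]
print(description)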

This prints the sequence description for the given identifier.
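The question also asks about handling a list of GI numbers and writing the descriptions to a file. The following is a minimal sketch of one way to do that, not part of the original answer. The function names `fetch_titles` and `write_descriptions`, the tab-separated output format, and the email address are assumptions made for illustration. It relies on `Entrez.esummary` accepting a comma-separated list of identifiers and on each summary exposing `"Id"` and `"Title"` fields, as in the snippet above.

```python
def fetch_titles(ids, db="nucleotide", email="your.email@example.com"):
    """Return (identifier, title) pairs for the given identifiers.

    Needs Biopython and network access. The email default is a placeholder.
    """
    # Imported inside the function so that write_descriptions() below
    # can be used (and checked) without Biopython installed.
    from Bio import Entrez

    Entrez.email = email
    # One request for all identifiers, passed as a comma-separated string.
    handle = Entrez.esummary(db=db, id=",".join(ids))
    records = Entrez.read(handle)
    handle.close()
    return [(str(rec["Id"]), str(rec["Title"])) for rec in records]


def write_descriptions(pairs, path):
    """Write one 'identifier<TAB>description' line per pair."""
    with open(path, "w") as out:
        for gi, title in pairs:
            out.write(f"{gi}\t{title}\n")
```

With a list of identifiers in `ids`, calling `write_descriptions(fetch_titles(ids), "descriptions.tsv")` would store each description next to its identifier. The returned pairs can also be kept in a variable for further processing.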

Source

2016-02-26 10:34:06

Here is a script I wrote that pulls the entire GenBank file for each GenBank identifier listed in a file. It should be easy to adapt to your application.

# This program opens a file containing NCBI sequence identifiers (one per line),
# fetches the associated records and writes the data to a *.gb file.

import os
import sys
from Bio import Entrez
Entrez.email = "your.email@example.com"  # placeholder; always tell NCBI who you are

name = input("\nEnter file name with sequence identifications only: ")
try:        # checks to make sure the input file is in the folder
    handle = open(name, 'r')
except OSError:
    print("File does not exist in folder! Check file name and extension.")
    sys.exit(1)

outfile = os.path.splitext(name)[0] + "_GB_Full.gb"
totalhand = open(outfile, 'w')

for line in handle:
    line = line.strip()    # strips \n and surrounding whitespace
    if not line:           # skip blank lines
        continue
    print(line)
    fetch_handle = Entrez.efetch(db="nucleotide", rettype="gb", retmode="text", id=line)
    data = fetch_handle.read()
    fetch_handle.close()
    totalhand.write(data)

handle.close()
totalhand.close()
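As one possible adaptation for the original question, the fetched GenBank text can be parsed with `Bio.SeqIO` to get only the description of each record. This is a hedged sketch, not the answer author's code. The helper names `batched` and `fetch_descriptions`, the batch size of 100, and the email address are assumptions chosen for illustration.

```python
def batched(ids, size):
    """Yield successive lists of at most `size` identifiers."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_descriptions(ids, db="nucleotide", batch_size=100,
                       email="your.email@example.com"):
    """Yield (record id, description) pairs.

    Needs Biopython and network access. The email default and the
    batch size are placeholders, not recommended values.
    """
    # Imported inside the function so batched() works without Biopython.
    from Bio import Entrez, SeqIO

    Entrez.email = email
    for batch in batched(ids, batch_size):
        # Several identifiers per request, passed as a comma-separated string.
        handle = Entrez.efetch(db=db, rettype="gb", retmode="text",
                               id=",".join(batch))
        # SeqIO parses the GenBank text into SeqRecord objects.
        for record in SeqIO.parse(handle, "gb"):
            yield record.id, record.description
        handle.close()
```

Note that `record.id` here is the identifier Biopython reads from the GenBank record itself, which may not be the GI number that was requested. Fetching in batches keeps the number of requests lower than one call per line of the input file.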

Source

2016-02-26 18:19:51 Damian
