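I am using BioPython's MuscleCommandline in a subprocess to align sequences. MUSCLE reads its input from stdin and writes its output to stdout. This works, but every time Popen calls muscle, I get a program summary from muscle on the screen. This slows the program down considerably, because there are millions of calls to the subprocess.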
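from subprocess import Popen, PIPE
from Bio import AlignIO, SeqIO
from Bio.Align import AlignInfo
from Bio.Align.Applications import MuscleCommandline
from Bio.Alphabet import IUPAC
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

mcline = MuscleCommandline()  # No arguments, so muscle reads stdin and writes stdout.
# grouped_reads_list is defined earlier in my program (a list of sequence strings).
read_list = (SeqRecord(Seq(seq, IUPAC.unambiguous_dna), str(index)) for index, seq in enumerate(grouped_reads_list))
muscle = Popen(str(mcline), stdin=PIPE, stdout=PIPE, universal_newlines=True)
SeqIO.write(read_list, muscle.stdin, "fasta")  # Send sequences to Muscle in FASTA format.
muscle.stdin.close()
align = AlignIO.read(muscle.stdout, "fasta")  # Capture output from muscle and get it into FASTA format in an object.
print(align)
muscle.stdout.close()
exit("Testing")  # Temporary stop while debugging; the consensus step below is not reached.
consensus_read = AlignInfo.SummaryInfo(align).dumb_consensus(threshold=0.6, ambiguous="N", consensus_alpha=IUPAC.ambiguous_dna)  # Create consensus from alignment object.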
The screen output is:
MUSCLE v3.8.31 by Robert C. Edgar
http://www.drive5.com/muscle
This software is donated to the public domain.
Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.
- 2 seqs, max length 133, avg length 133
00:00:00 10 MB(-1%) Iter 1 100.00% K-mer dist pass 1
00:00:00 10 MB(-1%) Iter 1 100.00% K-mer dist pass 2
00:00:00 12 MB(-1%) Iter 1 100.00% Align node
00:00:00 12 MB(-1%) Iter 1 100.00% Root alignment
- 6 seqs, max length 133, avg length 133
SingleLetterAlphabet() alignment with 2 rows and 133 columns
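For illustration, here is a minimal sketch of one way a child process's banner could be kept off the screen. It assumes the summary is written to stderr, which the Popen call above does not capture. The child command is a hypothetical stand-in for muscle (a small Python one-liner that prints a banner to stderr and copies stdin to stdout), so the sketch runs without MUSCLE installed. The helper name run_quietly is also made up for this example.

```python
import subprocess
import sys

# Hypothetical stand-in for muscle (an assumption, not the real program):
# it writes a banner to stderr and echoes stdin to stdout unchanged.
CHILD_CMD = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('banner\\n'); sys.stdout.write(sys.stdin.read())",
]


def run_quietly(cmd, input_text):
    """Run cmd with input_text on stdin and return its stdout; stderr is discarded."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # Send anything written to stderr to the null device.
        universal_newlines=True,
    )
    out, _ = proc.communicate(input_text)  # Writes stdin, closes it, reads stdout to EOF.
    return out


fasta = ">0\nACGT\n>1\nACGA\n"
result = run_quietly(CHILD_CMD, fasta)
print(result)
```

With the real program, the same stderr=subprocess.DEVNULL argument could presumably be added to the existing Popen call. MUSCLE 3.8 also documents a -quiet option that may be worth checking, though I have not verified it against this setup.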