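You will have to run tabix's prerequisites (bgzip, index) before the query can actually run. They are not included in tb.query.
If your file is already zipped (with bgzip), you should do this: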
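from subprocess import Popen, PIPE
import shlex

zippedf = 'qcat.gz'

def tabix_index(zippedf):
    # Build the tabix index for a bgzip-compressed file (-f overwrites an existing index)
    p = Popen(['tabix', '-f', zippedf], stdout=PIPE)
    # or: cmd = "tabix -f " + zippedf
    #     p = Popen(shlex.split(cmd), stdout=PIPE)
    # (shlex.split splits the command on whitespace)
    p.wait()  # block until tabix has finished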
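To avoid re-indexing on every run, one option is to check for an existing index first. The sketch below assumes tabix's default naming, where the index sits next to the data file with a ".tbi" suffix (an index built with a different format would need a different suffix). needs_index is a hypothetical helper name, not part of tabix or pytabix.

```python
import os

def needs_index(zippedf, suffix=".tbi"):
    """Return True if the index is missing or older than the data file.

    Assumption: the index is named <data file> + ".tbi" (tabix default).
    needs_index is a hypothetical helper, not a tabix/pytabix function.
    """
    index_path = zippedf + suffix
    if not os.path.exists(index_path):
        return True
    # An index older than the data file is probably stale.
    return os.path.getmtime(index_path) < os.path.getmtime(zippedf)
```

It could be used as: if needs_index(zippedf): tabix_index(zippedf)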
If you have an uncompressed file, you can run three subprocesses in sequence to sort it, bgzip it, and index it:
from subprocess import Popen, PIPE
import shlex

out_sorted = 'myfile.sorted'
out_zipped = out_sorted + ".gz"
with open(out_zipped, 'wb') as sort_zip_out:  # binary mode, since bgzip writes compressed bytes
    cmd = "sort -V -k1,1 myfile"
    p1 = Popen(shlex.split(cmd), stdout=PIPE)
    p2 = Popen(['bgzip', '-c', '-f'], stdin=p1.stdout, stdout=sort_zip_out)
    p1.stdout.close()  # drop our copy of the pipe so sort gets SIGPIPE if bgzip exits early
    p1.wait()  # wait for sort to finish
    p2.wait()  # wait for bgzip to finish writing the compressed file
# when these two subprocesses are finished, index the result
tabix_index(out_zipped)
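The sort and bgzip steps are run without checking whether either command succeeded. A minimal sketch of a helper that chains two commands and raises on a non-zero exit code is shown here; pipe_to_file is a hypothetical name, and both commands are assumed to be given as argument lists.

```python
from subprocess import Popen, PIPE, CalledProcessError

def pipe_to_file(cmd1, cmd2, out_path):
    """Run the equivalent of `cmd1 | cmd2 > out_path`; raise if either fails.

    pipe_to_file is a hypothetical helper; cmd1 and cmd2 are argument lists.
    """
    with open(out_path, "wb") as out:
        p1 = Popen(cmd1, stdout=PIPE)
        p2 = Popen(cmd2, stdin=p1.stdout, stdout=out)
        p1.stdout.close()  # let cmd1 receive SIGPIPE if cmd2 exits early
        rc2 = p2.wait()
        rc1 = p1.wait()
    # Report the first failing stage of the pipeline.
    if rc1 != 0:
        raise CalledProcessError(rc1, cmd1)
    if rc2 != 0:
        raise CalledProcessError(rc2, cmd2)
```

With this helper, the sort and compress steps would become pipe_to_file(['sort', '-V', '-k1,1', 'myfile'], ['bgzip', '-c', '-f'], out_zipped), followed by tabix_index(out_zipped) only if no exception was raised.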