2013-07-18 77 views
0

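Python multiple inputs and multiple outputs

I have written a script in Python that works on a single file. I can't figure out how to make it run on multiple files and produce a separate output for each file.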

import re

out = open('/home/directory/a.out', 'w')
infile = open('/home/directory/a.sam', 'r')

for line in infile:
    if not line.startswith('@'):
        samlist = line.strip().split()
        if 'I' or 'D' in samlist[5]:
            match = re.findall(r'(\d+)I', samlist[5])  # remember to change I and D here as well
            intlist = [int(x) for x in match]
            ## if len(intlist) < 10:
            for indel in intlist:
                if indel >= 10:
                    ## print indel
                    ### intlist contains lengths of insertions for each read
                    # print intlist
                    read_aln_start = int(samlist[3])
                    indel_positions = []
                    for num1, i_or_d, num2, m in re.findall(r'(\d+)([ID])(\d+)?([A-Za-z])?', samlist[5]):
                        if num1:
                            read_aln_start += int(num1)
                        if num2:
                            read_aln_start += int(num2)
                        indel_positions.append(read_aln_start)
                    # print indel_positions
                    out.write(str(read_aln_start) + '\t' + str(i_or_d) + '\t' + str(samlist[2]) + '\t' + str(indel) + '\n')
out.close()

I want my script to take multiple files with names like a.sam, b.sam, c.sam and give me an output for each file: aout.sam, bout.sam, cout.sam

Could you give me a solution or a hint?

Regards, Irek

+1

Have you tried wrapping the script in a function and passing the names of the input and output files as parameters? –

+3

`if 'I' or 'D' in samlist[5]` does not do what you think it does. That condition is always true. –

+0

I don't think it is always true. Only some lines contain I or D. Most of them actually have neither letter, so the condition is false. – Irek
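To illustrate the point about the condition: Python parses `'I' or 'D' in s` as `'I' or ('D' in s)`, and a non-empty string such as `'I'` is truthy, so the test passes for every line regardless of its content. A minimal sketch follows; the CIGAR strings and the helper names `buggy_check` and `fixed_check` are made up for illustration and are not part of the original script.

```python
def buggy_check(cigar):
    # Parsed as: 'I' or ('D' in cigar). The non-empty string 'I' is truthy,
    # so the result is True no matter what cigar contains.
    return bool('I' or 'D' in cigar)


def fixed_check(cigar):
    # Test each letter against the string explicitly.
    return 'I' in cigar or 'D' in cigar


# Hypothetical CIGAR strings, for illustration only.
for cigar in ['50M', '20M12I18M', '30M5D20M']:
    print(cigar, buggy_check(cigar), fixed_check(cigar))
```

In the original script the always-true test probably goes unnoticed because the later `re.findall(r'(\d+)I', ...)` returns an empty list for reads without insertions, so nothing is written for them anyway.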

Answers

1

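I suggest wrapping the script in a function, using the def keyword, and passing the input and output file names as parameters to that function.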

def do_stuff_with_files(infile, outfile):
    # Open the output for writing and the input for reading
    out = open(outfile, 'w')
    infile = open(infile, 'r')
    # the rest of your script

Now you can call this function with any combination of input and output file names.

do_stuff_with_files('/home/directory/a.sam', '/home/directory/a.out') 

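If you want to do this for all the files in a directory, use the glob library. To generate the output file names, just replace the last three characters ("sam") with "out".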

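import glob
import os

indir, outdir = '/home/directory/', '/home/directory/out/'
# glob.glob1 is undocumented, so use glob.glob and strip the directory part
files = [os.path.basename(p) for p in glob.glob(os.path.join(indir, '*.sam'))]
infiles = [os.path.join(indir, f) for f in files]
outfiles = [os.path.join(outdir, f[:-3] + 'out') for f in files]
for infile, outfile in zip(infiles, outfiles):
    do_stuff_with_files(infile, outfile)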
+0

`glob.glob('/home/directory/*.out')` does not work, because you would have to create the output files before running the script. – falsetru

+0

@falsetru Yes, I realized that too. Borrowed your approach. ;-) –

+0

@tobias_k What if I also want to create my outfiles in a different directory? – Irek
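One possible way to handle a separate output directory is to build each output path from the input file's base name and to create the directory before writing. This is only a sketch: the helper names `build_output_path` and `ensure_dir` and the `.out` extension default are assumptions made for illustration, not something given in the thread.

```python
import os


def build_output_path(infile, outdir, new_ext='.out'):
    """Return the path inside outdir that mirrors infile, with a new extension."""
    base = os.path.basename(infile)      # e.g. 'a.sam'
    stem, _ = os.path.splitext(base)     # e.g. 'a'
    return os.path.join(outdir, stem + new_ext)


def ensure_dir(path):
    """Create the directory (and any missing parents) if it does not exist yet."""
    os.makedirs(path, exist_ok=True)
```

Under these assumptions, you would call `ensure_dir(outdir)` once before the loop and then `do_stuff_with_files(infile, build_output_path(infile, outdir))` for each input file.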

4

Loop over the file names.

input_filenames = ['a.sam', 'b.sam', 'c.sam'] 
output_filenames = ['aout.sam', 'bout.sam', 'cout.sam'] 
for infn, outfn in zip(input_filenames, output_filenames): 
    out = open('/home/directory/{}'.format(outfn), 'w') 
    infile = open('/home/directory/{}'.format(infn), 'r') 
    ... 

UPDATE

The following code generates output_filenames from the given input_filenames.

import os 

def get_output_filename(fn): 
    filename, ext = os.path.splitext(fn) 
    return filename + 'out' + ext 

input_filenames = ['a.sam', 'b.sam', 'c.sam'] # or glob.glob('*.sam') 
output_filenames = [get_output_filename(fn) for fn in input_filenames]
+0

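Not quite what I was looking for. I would still need to write out all the file names. That is fine until there are 100 files in the directory. – Irek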

+0

@Irek, I added more code that generates output_filenames from input_filenames. – falsetru

+0

Great. Is it possible to generate the input file names as well? – Irek
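The input names can probably be collected with `glob.glob`, as the comment in the answer's code hints. The sketch below pairs each matching file with an output name built the same way as `get_output_filename` in the answer; the wrapper `find_pairs` and its `pattern` parameter are hypothetical names introduced only for this example.

```python
import glob
import os


def get_output_filename(fn):
    # Same idea as in the answer: 'a.sam' -> 'aout.sam'
    filename, ext = os.path.splitext(fn)
    return filename + 'out' + ext


def find_pairs(directory, pattern='*.sam'):
    """Return (input_path, output_path) pairs for files in directory matching pattern."""
    input_filenames = sorted(glob.glob(os.path.join(directory, pattern)))
    return [(fn, get_output_filename(fn)) for fn in input_filenames]


# Example loop; '/home/directory' is the path used in the question.
for infn, outfn in find_pairs('/home/directory'):
    print(infn, '->', outfn)
```

Note that outputs named like aout.sam also match `*.sam`, so a second run over the same directory would pick up the results of the first run as inputs. Writing the outputs to a different directory or with a different extension avoids that.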

1

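The following script handles both the input and the output files. It iterates over all files with the ".sam" extension in the given directory, performs the specified operation on them, and writes the results to separate files.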

import os
# Define the directory containing the files you are working with
path = '/home/directory'
# Get all the files in that directory with the desired
# extension (in this case ".sam")
files = [f for f in os.listdir(path) if f.endswith('.sam')]
# Loop over the files with that extension
for file in files:
    # Open the input file
    with open(path + '/' + file, 'r') as infile:
        # Open the output file
        with open(path + '/' + file.split('.')[0] + 'out.' +
                  file.split('.')[1], 'a') as outfile:
            # Loop over the lines in the input file
            for line in infile:
                # If a line in the input file can be characterized in a
                # certain way, write a different line to the output file.
                # Otherwise write the original line (from the input file)
                # to the output file
                if line.startswith('Something'):
                    outfile.write('A different kind of something')
                else:
                    outfile.write(line)
    # Note the absence of either an infile.close() or an outfile.close()
    # statement. The with-statement handles that for you