2014-09-22

Writing the list produced by this code to a CSV file

for record in SeqIO.parse(open(file, 'rU'), 'fasta', generic_protein):
    record_id = re.sub(r'\d+_(\d+_\d\#\d+)_\d+', r'\1', record.id)

    if not reference_sequence:
        reference_sequence = record.seq
        reference_name = record_id
        #continue
    print ",".join([reference_name, record_id, compare_seqs(reference_sequence, record.seq)])

This prints output to the terminal that looks like:

7065_8#1,8987_2#53, 
7065_8#1,8987_2#58, 
7065_8#1,8987_2#61, 
7065_8#1,8987_2#62,E-G [246] 
7065_8#1,8987_2#65,N-K [71],Y-D [223] 

I want to write this output to a CSV file line by line, perhaps via a nested list. Any suggestions? My attempt:

for record in SeqIO.parse(open(file, 'rU'), 'fasta', generic_protein):
    record_id = re.sub(r'\d+_(\d+_\d\#\d+)_\d+', r'\1', record.id)

    if not reference_sequence:
        reference_sequence = record.seq
        reference_name = record_id
        #continue
    line = ",".join([reference_name, record_id, compare_seqs(reference_sequence, record.seq)])
    with open(csvfile, "w") as output:
        writer = csv.writer(output, lineterminator='\n')
        writer.writerow([line])

Answers

1

Collect all the records in a list (i.e. instead of print ','.join(...) you do records.append([...])). Then you can call writerows(records) to write them all to the file.
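A minimal sketch of that idea, written for Python 3. The names here are stand-ins, not the asker's real code: `fake_records` replaces the `SeqIO.parse(...)` loop, and `compare_seqs` is a hypothetical reimplementation, since the question does not show the real one.

```python
import csv
import io


def compare_seqs(ref, seq):
    """Hypothetical stand-in: list substitutions as 'A-B [position]' (1-based)."""
    return ",".join(
        "%s-%s [%d]" % (a, b, i)
        for i, (a, b) in enumerate(zip(ref, seq), start=1)
        if a != b
    )


# Made-up (id, sequence) pairs standing in for the parsed FASTA records.
fake_records = [("7065_8#1", "MKNE"), ("8987_2#53", "MKNE"), ("8987_2#62", "MKNG")]

reference_name, reference_sequence = fake_records[0]

# Collect one list per output row instead of printing a joined string.
records = []
for record_id, seq in fake_records[1:]:
    records.append([reference_name, record_id, compare_seqs(reference_sequence, seq)])

# An in-memory buffer keeps the sketch self-contained; for a real file,
# open("file.csv", "w", newline="") could be used instead.
output = io.StringIO()
csv.writer(output, lineterminator="\n").writerows(records)
csv_text = output.getvalue()
print(csv_text)
```

The file is opened once and all rows are written in a single call, which avoids overwriting the file on every loop iteration.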

This is great too! Thanks – user3234810 2014-09-22 14:16:15

1

You can use writerows as follows to save the output. There is no need for something like ','.join(); the csv module does that for you.

For completeness:

records = []
for record in SeqIO.parse(open(file, 'rU'), 'fasta', generic_protein):
    record_id = re.sub(r'\d+_(\d+_\d\#\d+)_\d+', r'\1', record.id)

    if not reference_sequence:
        reference_sequence = record.seq
        reference_name = record_id
        #continue
    records.append([reference_name, record_id, compare_seqs(reference_sequence, record.seq)])

# csv.writer is not a context manager, so open the file with `with` and wrap it
with open('file.csv', 'w') as fp:
    # note that it's writerows, not writerow: it writes multiple rows at once
    csv.writer(fp).writerows(records)
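To illustrate why the manual join is unnecessary, here is a small Python 3 sketch. It assumes that `compare_seqs` returns a single string that may itself contain commas, as the sample output in the question suggests. The row is copied from that output.

```python
import csv
import io

# One row from the question's output; the third value contains a comma.
row = ["7065_8#1", "8987_2#65", "N-K [71],Y-D [223]"]

# Joining by hand: the embedded comma makes the line look like four fields.
naive = ",".join(row)

# Letting csv.writer do it: the third field is quoted automatically.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerow(row)
quoted = buffer.getvalue()

# Reading both lines back shows the difference.
naive_fields = next(csv.reader([naive]))
quoted_fields = next(csv.reader(io.StringIO(quoted)))
print(len(naive_fields), len(quoted_fields))
```

If each substitution should end up in its own column instead, splitting the third value on ',' before appending the row would be one option.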

I have edited the question above to make it clearer! – user3234810 2014-09-22 14:05:02

You're awesome! Thank you! – user3234810 2014-09-22 14:09:40

You're welcome! Now you can let the community know by accepting the answer! – Kasramvd 2014-09-22 14:12:44

1

You can also write the comma-separated string (together with a quote character) directly to the file:

f = open("output.csv", "w")
for record in SeqIO.parse(open(file, 'rU'), 'fasta', generic_protein):
    record_id = re.sub(r'\d+_(\d+_\d\#\d+)_\d+', r'\1', record.id)

    if not reference_sequence:
        reference_sequence = record.seq
        reference_name = record_id
        #continue
    # join the fields with "," and wrap the row in quotes so every field ends up quoted
    csvrow = '","'.join([reference_name, record_id, compare_seqs(reference_sequence, record.seq)])
    csvrow = '"' + csvrow + '"'
    print >>f, csvrow  # Python 2 syntax for printing to a file
f.close()

With this approach, you can open the file and check whether the data is being written even while the script is still running.
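One caveat: file output is normally buffered, so rows may only become visible to other programs once the buffer is flushed. Below is a rough Python 3 sketch of row-by-row writing with an explicit flush. It swaps in `csv.writer` with `quoting=csv.QUOTE_ALL` in place of the hand-built quotes. The rows are copied from the question's output, and the temporary path is just a placeholder.

```python
import csv
import os
import tempfile

# Sample rows taken from the question's output.
rows = [
    ["7065_8#1", "8987_2#62", "E-G [246]"],
    ["7065_8#1", "8987_2#65", "N-K [71],Y-D [223]"],
]

# Placeholder location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "output.csv")

sizes = []
with open(path, "w", newline="") as f:
    writer = csv.writer(f, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for row in rows:
        writer.writerow(row)
        f.flush()  # hand buffered data to the OS so other programs can see it
        sizes.append(os.path.getsize(path))  # file size after each row

with open(path, newline="") as f:
    written = f.read()
print(sizes)
print(written)
```

The recorded file size grows after each row, which is what makes it possible to inspect the file while a long-running script is still working.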