2015-04-04 22 views

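How to write output in multiple columns in Python

I want to write output to a file in multiple columns in Python. My code currently produces only two columns of output. The code is: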

f2 = open("C:/Python26/Semantics.txt", 'w')
sem = ["cells", "gene", "factor", "alpha", "receptor", "t", "promoter"]
with open("C:/Python26/trigram.txt") as f:
    for x in f:
        x = x.strip().split("$")
        # Write the tokens, then the number of distinct tokens that appear in sem.
        f2.write(" ".join(x) + " " + str(len(set(sem) & set(x))) + "\n")
f2.close()

My file looks like this:

IL-2$gene$expression$and 
IL-2$gene$expression$and$NF-kappa 
IL-2$gene$expression$and$NF-kappa$B 
IL-2$gene$expression$and$NF-kappa$B$activation 
gene$expression$and$NF-kappa$B$activation$through 
expression$and$NF-kappa$B$activation$through$CD28 

My current output:

IL-2 gene expression and 1 
IL-2 gene expression and NF-kappa 1 
IL-2 gene expression and NF-kappa B 1 
IL-2 gene expression and NF-kappa B activation 1 
gene expression and NF-kappa B activation through 1 
expression and NF-kappa B activation through CD28 0 

My desired output:

Token                                              cells  gene  factor  …….  promoter
IL-2 gene expression and                           0      1     0       ………  0
IL-2 gene expression and NF-kappa                  0      1     0       ………  0
IL-2 gene expression and NF-kappa B                0      1     0       ………  0
IL-2 gene expression and NF-kappa B activation     0      1     0       ………  0
gene expression and NF-kappa B activation through  0      1     0       ………  0
expression and NF-kappa B activation through CD28  0      0     0       ………  0

I think the code needs a small change, probably a nested loop, but I don't know how to do it. My attempt, which does not work, is below:

sem = ["cells", "b", "expression", "cell", "gene", "factor", "activation", "protein", "activity", "transcription", "alpha", "receptor", "t", "promotor", "mrna", "site", "kinase", "nfkappa", "human"]
f2 = open("C:/Python26/Semantics.txt", 'w')
with open("C:/Python26/trigram.txt") as file:
    for s in sem:
        for lines in file:
            lines = lines.strip().split("$")
            if s == lines:
                f2.write(" ".join(lines) + "\t" + str(len(set(sem) & set(lines))) + "\n")
        f2.write("\n")
f2.close()

http://stackoverflow.com/questions/5676646/fill-out-a-python-string-with-spaces – huxley 2015-04-04 09:17:40
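The padding idea from the linked question could be applied here roughly as follows. This is only a sketch: the `lines` list stands in for the contents of trigram.txt, and `format_rows` is a hypothetical helper name, not something from the question. It writes one 0/1 flag per term in `sem` and uses `str.ljust` to pad every cell to its column width.

```python
# Sample input standing in for trigram.txt (two lines taken from the question).
lines = [
    "IL-2$gene$expression$and",
    "expression$and$NF-kappa$B$activation$through$CD28",
]
sem = ["cells", "gene", "factor", "alpha", "receptor", "t", "promoter"]


def format_rows(lines, sem):
    """Return a header row plus one row per line, with a 0/1 flag per term."""
    tokens = [line.strip().split("$") for line in lines]
    labels = [" ".join(t) for t in tokens]
    # First column is as wide as the longest label (or the header), plus a gap.
    width = max(len(s) for s in labels + ["Token"]) + 2
    rows = ["Token".ljust(width) + " ".join(sem)]
    for label, t in zip(labels, tokens):
        present = set(t)
        # Pad each flag to the width of its column header so columns line up.
        flags = [str(int(s in present)).ljust(len(s)) for s in sem]
        rows.append((label.ljust(width) + " ".join(flags)).rstrip())
    return rows


for row in format_rows(lines, sem):
    print(row)
```

To write the table to a file instead of printing it, each row can be passed to `f2.write(row + "\n")`.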

Answer


pandas.DataFrame

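A DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it like a spreadsheet or SQL table, or a dict of Series objects.

You can create your DataFrame object, convert it to a string, and then write() that string to your file.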

import pandas

col_labels = ['Token', 'cells', 'gene']
row_labels = ['x', 'y', 'z']

values_array = [[1, 2, 3],
                [10, 20, 30],
                [100, 200, 300]]

# Pass the labels by keyword: index holds the row labels, columns the column labels.
df = pandas.DataFrame(values_array, index=row_labels, columns=col_labels)
print(df)

Output:

   Token  cells  gene
x      1      2     3
y     10     20    30
z    100    200   300

To save it, first convert the object to a string:

db_as_str = df.to_string() 

with open('my_text_file.txt', 'w') as f: 
    f.write(db_as_str) 

Or save it as is, as CSV:

df.to_csv('my_text_file.txt')
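Applied to the data in the question, the DataFrame approach might look like the sketch below. The `lines` list stands in for trigram.txt, and `build_table` is a hypothetical helper name, not part of pandas. Each row is a token string and each column is one term from `sem`, holding 1 if the term occurs in that row and 0 otherwise.

```python
import pandas

sem = ["cells", "gene", "factor", "alpha", "receptor", "t", "promoter"]
# Sample input standing in for trigram.txt (two lines taken from the question).
lines = [
    "IL-2$gene$expression$and",
    "expression$and$NF-kappa$B$activation$through$CD28",
]


def build_table(lines, sem):
    """Build a DataFrame with one row per line and one 0/1 column per term."""
    tokens = [line.strip().split("$") for line in lines]
    # One list of flags per row: 1 if the term is among the row's tokens.
    data = [[int(s in set(t)) for s in sem] for t in tokens]
    index = pandas.Index([" ".join(t) for t in tokens], name="Token")
    return pandas.DataFrame(data, index=index, columns=sem)


df = build_table(lines, sem)
print(df.to_string())
```

The resulting `df` can then be saved with `to_string()` and `write()`, or with `to_csv()`, in the same way as above.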