2014-09-21 93 views
1

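Match lines between two files and mark the matched strings

Given two files A and B, is there a way to change the font, color, etc. of the strings in B that overlap with strings in A when the two files are matched? Non-matching strings should be left as they are, so the output file should keep the same length as the input.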

Example:

File A

NM_134083 mmu-miR-96-5p NM_134083  0.96213 -0.054 
NM_177305 mmu-miR-96-5p NM_177305  0.95707 -0.099 
NM_026184 mmu-miR-93-3p NM_026184  0.9552 -0.01 

File B

NM_134083 
NM_177305 
NM_17343052324 

Output

**NM_134083** mmu-miR-96-5p **NM_134083**  0.96213 -0.054 
**NM_177305** mmu-miR-96-5p **NM_177305**  0.95707 -0.099 
+2

How should I picture this? Is there no example? – user1767754 2014-09-21 13:22:36

+1

Please follow this: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – lilster 2014-09-21 13:54:33

+0

Why is this tagged R? – 2014-09-21 15:24:25

Answers

1

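You give raw text and do not specify what kind of formatting you want to apply. Leaving the formatting details aside: yes, you can replace the text in FileA that also appears in FileB with formatted content.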

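import re

# Read both files, stripping surrounding whitespace and newlines
with open('fileA.txt') as A:
    A_content = [x.strip() for x in A]
with open('fileB.txt') as B:
    # skip blank lines: an empty pattern would match at every position
    B_content = [x.strip() for x in B if x.strip()]

output = []
for line_A in A_content:
    for line_B in B_content:
        # do whatever formatting you need on the text,
        # I am just surrounding it with *'s here
        replace = "**" + line_B + "**"

        # use re.sub,
        # details here: https://docs.python.org/2/library/re.html#re.sub
        # re.escape makes line_B match literally; the lambda keeps any
        # backslashes in the replacement from being read as escapes
        line_A = re.sub(re.escape(line_B), lambda m: replace, line_A)
    # I am adding everything to the output list, but you can check whether the
    # line differs from the initial content. I leave that for you to do
    output.append(line_A)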

Output

**NM_134083** mmu-miR-96-5p **NM_134083**  0.96213 -0.054 
**NM_177305** mmu-miR-96-5p **NM_177305**  0.95707 -0.099 
NM_026184 mmu-miR-93-3p NM_026184  0.9552 -0.01 
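
The nested loop above matches substrings, so a short ID from file B would also be marked inside a longer identifier in file A. A minimal sketch of a whole-token variant follows, assuming the IDs are made of word characters as in the example data. The function name `mark_matches` and its `left`/`right` parameters are hypothetical, chosen only for illustration, and the file contents are inlined as lists so the snippet runs on its own.

```python
import re

def mark_matches(lines, ids, left="**", right="**"):
    """Wrap every whole-token occurrence of an ID in `lines` with markers."""
    ids = [i for i in ids if i]  # ignore blank entries from file B
    if not ids:
        return list(lines)
    # Longest IDs first, so that when one ID is a prefix of another
    # the longer one is tried first.
    alternatives = "|".join(
        re.escape(i) for i in sorted(ids, key=len, reverse=True)
    )
    # (?<!\w) and (?!\w) reject matches embedded in a longer identifier.
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)")
    return [
        pattern.sub(lambda m: left + m.group(0) + right, line)
        for line in lines
    ]

# Example data copied from the question (trailing spaces stripped).
file_a = [
    "NM_134083 mmu-miR-96-5p NM_134083  0.96213 -0.054",
    "NM_177305 mmu-miR-96-5p NM_177305  0.95707 -0.099",
    "NM_026184 mmu-miR-93-3p NM_026184  0.9552 -0.01",
]
file_b = ["NM_134083", "NM_177305", "NM_17343052324"]

marked = mark_matches(file_a, file_b)
for line in marked:
    print(line)
```

Compiling one alternation pattern also means each line of file A is scanned once rather than once per ID, which may matter if file B is long. The output keeps one line per input line, so its length matches file A.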
+0

How would the bold formatting be displayed? – user3741035 2014-09-21 16:32:21
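
On the display question: a plain text file carries no font information, so the `**` markers only become bold once they are rendered in some target format. A minimal sketch, assuming the target is either a terminal that understands ANSI escape codes or an HTML page. The helper names `to_ansi` and `to_html` are hypothetical, and the sketch assumes the data itself never contains a literal `**`.

```python
import html
import re

# Matches the **...** markers produced by the answer's code.
BOLD_MARK = re.compile(r"\*\*(.+?)\*\*")

def to_ansi(line):
    """Render **text** as bold using ANSI escape codes (ESC[1m ... ESC[0m)."""
    return BOLD_MARK.sub(lambda m: "\033[1m" + m.group(1) + "\033[0m", line)

def to_html(line):
    """Render **text** as <b>text</b>, HTML-escaping the rest of the line."""
    escaped = html.escape(line)  # leaves the * markers untouched
    return BOLD_MARK.sub(lambda m: "<b>" + m.group(1) + "</b>", escaped)

line = "**NM_134083** mmu-miR-96-5p **NM_134083**  0.96213 -0.054"
print(to_ansi(line))  # shows bold in terminals that support ANSI codes
print(to_html(line))
```

If the output needs to be a word-processor or spreadsheet document instead, the same markers could be mapped to that format's own bold markup; which library to use for that is outside what this thread specifies.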
