2013-03-22 190 views
0

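I have a text file containing a string with names that I want to replace. I have another file with two columns, A and B, that contain names. Column A contains the same names as the string (file 1). I basically want to replace those names with the names in column B. I have tried doing this in Python, but I am still too much of a beginner to pull it off. Any pointers would be greatly appreciated. Python search and replace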

 

File1    
NameA.....NameB....NameC....etc 

File2     
A  B  
NameA NameD   
NameB NameE   
NameC NameF 

Wanted:

 
File1      
NameD....NameE....NameF....etc 

+2

What have you tried (http://www.whathaveyoutried.com) that isn't working? – AlG 2013-03-22 16:57:17

+0

I should have added that, of course. See below. – 2013-03-23 14:03:42

Answers

0

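I would consider using RegEx (the re module in Python). It lets you write functions that search for specific text patterns. If you set up re.compile() and re.search() correctly, you can use the group() method of the resulting match to extract selected "groups" of the text. The library is fairly extensive, so here is a link to the documentation: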

http://docs.python.org/2/library/re.html

I would also look at an online tutorial, such as this one:

http://www.youtube.com/watch?v=DRR9fOXkfRE
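As a rough illustration of the compile / search / group workflow mentioned above, here is a minimal sketch. The sample text and the pattern are assumptions based on the placeholder names in the question (NameA, NameB, ...), not the asker's real data.

```python
import re

# Hypothetical sample text modelled on the placeholder names in the question.
text = "NameA.....NameB....NameC"

# Compile the pattern once; the parentheses define capture group 1.
pattern = re.compile(r"Name([A-Z])")

# search() returns the first match object, or None if nothing matches.
match = pattern.search(text)
if match:
    whole = match.group(0)   # the entire matched text
    suffix = match.group(1)  # only the part captured by the parentheses

# finditer() yields one match object per occurrence in the text.
all_names = [m.group(0) for m in pattern.finditer(text)]
```

Here group(0) is the full match and group(1) is the captured letter, so the same pattern can be used both to locate names and to pick out parts of them.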

1
# read the first file as a list of names
with open("file1") as f:
    names1 = f.read().split()

# read file2 as a dictionary mapping column A to column B (blank lines are skipped)
with open("file2") as f:
    names2 = dict(line.split() for line in f if line.strip())

# write the replacements to file3; names missing from file2 are kept unchanged
with open("file3", "w") as f:
    f.write(" ".join(names2.get(name, name) for name in names1))
0

I think you need code like this:

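# Read the whole text once; iterating over File1 inside the loop would exhaust it
with open("File1", "r") as file1:
    text = file1.read()

# Assumes File2 has no header row and two whitespace-separated columns
with open("File2", "r") as file2:
    for line in file2:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed rows
        A, B = parts
        # str.replace() returns a new string, so the result must be reassigned
        text = text.replace(A, B)

with open("File3", "w") as file3:
    file3.write(text)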
1
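# Read the names from File1 before it is overwritten below
with open('File1', 'r') as fd:
    keys = fd.read().split()

name_map = {}

with open('File2', 'r') as fd:
    for line in fd:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        name_map[key] = value

# Overwrite File1; names without an entry in File2 are kept as they are
with open('File1', 'w') as fd:
    new_names = [name_map.get(k, k) for k in keys]
    fd.write(" ".join(new_names))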
0

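Thanks for the replies. None of them quite worked, though, possibly because of the nature of the string in file1 (Newick format). Here is my original attempt... probably not very good. Still, if I could get a replace function working, it might do the trick..?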

import re

with open("file1.txt", "r") as f:
    LineString = f.read()

pattern = re.compile(r'\d+OTU_\d+_\w+_\d+')
words = pattern.findall(LineString)

colA = []
colB = []

with open("file2.txt", "r") as f:
    for line in f:
        parts = line.split()
        if len(parts) > 0:
            colA.append(parts[0])
        if len(parts) > 1:
            colB.append(parts[1])

# Doesn't work: str.replace() takes strings, not lists, and its result is never stored
if words == colA:
    LineString.replace(colA, colB)

The string in file1 looks like: (((((((((( '1OTU_1_769_wint_446':0.00156420, '1OTU_1_822_wint_445':0.00000000)0.5700:0.00156410, '1OTU_1_851_wint_454':0.00000000) etc...

The entries in words, colA and colB look like this, e.g.: 1OTU_1_769_wint_446

+0

Combining the RegEx search with the dictionary example provided by Yarkee worked like a charm. Thanks. – 2013-03-23 14:56:15

+0

You should incorporate that into your question so everyone can see it. – AlG 2013-03-23 22:28:40
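The working code was never posted in the thread, so the following is only a sketch of how a regex search might be combined with a dictionary lookup, as the comments describe. The helper names load_name_map and rename_labels, and the replacement names SampleX and SampleY, are hypothetical. The pattern is the one from the asker's attempt.

```python
import re

# Pattern from the asker's attempt, written as a raw string.
NAME_PATTERN = re.compile(r"\d+OTU_\d+_\w+_\d+")


def load_name_map(lines):
    """Build {old_name: new_name} from lines with two whitespace-separated columns."""
    name_map = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:  # skip blank or one-column lines
            name_map[parts[0]] = parts[1]
    return name_map


def rename_labels(text, name_map):
    """Replace each matched name via the dictionary; unknown names stay unchanged."""
    return NAME_PATTERN.sub(lambda m: name_map.get(m.group(0), m.group(0)), text)


# Hypothetical data: a short Newick-like string and two made-up replacement names.
newick = "(('1OTU_1_769_wint_446':0.00156420,'1OTU_1_822_wint_445':0.00000000)0.5700:0.00156410);"
table = ["1OTU_1_769_wint_446 SampleX", "1OTU_1_822_wint_445 SampleY"]

renamed = rename_labels(newick, load_name_map(table))
```

Because re.sub() replaces each whole match in a single pass, a name that is a prefix of another name cannot be replaced by accident, which can happen with repeated str.replace() calls. With real files, the lines of file2.txt could be passed directly to load_name_map, for example by iterating over the open file.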