2014-09-22 41 views
-1

I'm a Python beginner and want to learn how to replace text using data from different files: how do I read and replace text line by line in Python?

I know the basics of how to do this, but I need help with the following:

I have 3 files: main.txt, names.txt and number.txt.

names.txt looks like this:

Anna 
Smith 
Bob 
Jhon 

number.txt looks like this:

1-522-223 
1-523-232 
1-593-573 
1-322-242 

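Line 1 of names.txt corresponds to line 1 of number.txt (so Anna's phone number is the first one in number.txt, Smith's is the second, and so on).

Now here is the problem. The file main.txt looks like this: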

The person Judy lives in Ontario and has phone number 1-888-2923 
The person Michael lives in Toronto and has phone number 1-999-2388 
The person Cameron lives in Berlin and has phone number 1-666-2888 
The person Douglas lives in Tokyo and has phone number 5-7777-223 

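I know how to find and replace; the problem is that I need to change the name and phone number on each line of main.txt to the ones on the corresponding lines of names.txt and number.txt. So the edited main.txt should be: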

The person Anna lives in Ontario and has phone number 1-522-223 
The person Smith lives in Toronto and has phone number 1-523-232 
The person Bob lives in Berlin and has phone number 1-593-573 
and so on... 

I really don't know how to do this, and the files are fairly large, around 2000 lines of text. Can anyone help me?

+0

Could you post the code you have written so far? – user2314737 2014-09-22 09:52:33

+0

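Why is the information split across three files to begin with? That can easily get out of sync; you might consider using a CSV-style file with the information in separate columns. – Werner 2014-09-22 09:55:14

Answers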

1

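You can zip the corresponding lines from each file together, update each line of main.txt in a single pass, and write it back out. I used a new file for the output.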

COL_NAME = 2        # index of the name in a split line of main.txt
COL_PHONENUM = -1   # the phone number is the last word

with open('names.txt') as names, open('number.txt') as numbers, \
     open('main.txt') as main, open('new_main.txt', 'w') as outfile:
    # zip corresponding lines from each file
    for name, number, line in zip(names, numbers, main):
        main_data = line.split()
        main_data[COL_NAME] = name.strip()
        main_data[COL_PHONENUM] = number.strip()
        outfile.write('{}\n'.format(' '.join(main_data)))

Contents of new_main.txt:

$ cat new_main.txt 
The person Anna lives in Ontario and has phone number 1-522-223 
The person Smith lives in Toronto and has phone number 1-523-232 
The person Bob lives in Berlin and has phone number 1-593-573 
The person Jhon lives in Tokyo and has phone number 1-322-242 
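
One caveat with this approach: zip stops silently at the shortest input, so a missing line in any of the three files would go unnoticed. A minimal sketch of a stricter variant follows; the helper name merge_lines and its parameters are assumptions made for illustration, not part of the answer above. It uses itertools.zip_longest with a sentinel to detect inputs of different lengths.

```python
from itertools import zip_longest


def merge_lines(names, numbers, main_lines, col_name=2, col_phone=-1):
    """Return the updated main lines; raise ValueError if the inputs
    do not all have the same number of lines (hypothetical helper)."""
    missing = object()  # sentinel marking an input that ran out early
    merged = []
    rows = zip_longest(names, numbers, main_lines, fillvalue=missing)
    for lineno, (name, number, line) in enumerate(rows, start=1):
        if missing in (name, number, line):
            raise ValueError('inputs out of sync at line {}'.format(lineno))
        words = line.split()
        words[col_name] = name.strip()
        words[col_phone] = number.strip()
        merged.append(' '.join(words))
    return merged

# Possible usage with the files from the question:
# with open('names.txt') as n, open('number.txt') as p, open('main.txt') as m:
#     new_lines = merge_lines(n, p, m)
```

The function accepts any iterables of strings, so open file objects or plain lists both work.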
+0

Great! I had no idea about zip; I just read the docs and it seems to be the best approach. Thanks a lot! – 2014-09-22 10:34:53

1

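As the number is the last element ([-1]) of each line in main.txt and the name is the 3rd ([2]), you can split each line of main.txt and replace the name and number: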

with open('names.txt') as f:
    names = [line.strip() for line in f]

with open('number.txt') as f:
    numbers = [line.strip() for line in f]

with open('main.txt') as f:
    main = f.readlines()

newmain = []
# pair each line of main.txt with its matching name and number
for line, name, number in zip(main, names, numbers):
    words = line.split()
    words[2] = name      # the name is the 3rd word
    words[-1] = number   # the number is the last word
    newmain.append(' '.join(words))

# overwrite main.txt with the updated lines
with open('main.txt', 'w') as f:
    f.write('\n'.join(newmain) + '\n')
+0

Thanks, I didn't know about zip! – 2014-09-22 10:35:19

+0

It zips its iterator arguments together! zip(l, k) gives [(1,'a'),(2,'b'),(3,'c')] – Kasramvd 2014-09-22 10:40:02

0

So, you have 3 datasets:

  • names
  • cities
  • numbers

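Assuming the relationship between the items of the 3 sets is given by their position (name #1 goes with city #1 and phone number #1), you will have to:

  1. Extract the list of cities from main.txt (e.g. using a regular expression)
  2. Organize the data (list or dictionary)
  3. Rebuild a new main.txt using a template string with formatting

Let's go: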

import re


def extractCities(path_to_main_txt_file):
    '''takes a path to txt file
    returns a list of cities'''

    with open(path_to_main_txt_file, 'r') as f:
        txt = f.read()

    # capture the text between "lives in " and " and has" on each line
    return re.findall(r'lives in (.*?) and has', txt)


def organizeData(names, cities, numbers):
    '''takes 3 lists
    returns 1 nested list'''

    # pair items by position; zip also copes with duplicate names
    return [list(item) for item in zip(names, cities, numbers)]

Usage

>>> with open(r'path/to/names.txt') as f: 
     names = f.read().splitlines() 
>>> with open(r'path/to/numbers.txt') as f: 
     numbers = f.read().splitlines() 
>>> cities = extractCities(r'path/to/main.txt') 
>>> data = organizeData(names, cities, numbers) 
>>> template = u'The person {p} lives in {c} and has phone number {n}\n' 
>>> main = [template.format(p=d[0], c=d[1], n=d[2]) for d in data] 

Now main contains a list of strings: you can write them to a new file, overwrite the original file...
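
Writing the list out could look like the sketch below. The helper name write_lines and the output file name new_main.txt are assumptions for illustration only. Because the template already ends each string with '\n', writelines can be used directly, since it does not add line endings itself.

```python
def write_lines(lines, path):
    """Write an iterable of newline-terminated strings to path
    (hypothetical helper, overwrites an existing file)."""
    with open(path, 'w') as f:
        f.writelines(lines)

# Possible usage with the list built above:
# write_lines(main, 'new_main.txt')   # or 'main.txt' to overwrite the original
```

Writing to a new file first is the safer choice, since the original main.txt stays available if the result is not what was expected.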

+0

Posted at the same time as @Kasra: more or less the same approach... – outforawhile 2014-09-22 10:47:35

+0

Thank you very much for taking the time to write this. I really appreciate it and learned a lot from all these different approaches. +1 to you! – 2014-09-22 11:20:52