Python - Combining text files (specific lines)

I have two large text files with data from one experiment, and I want to merge them into a single file in a particular way.

A small sample of the data:

File 1:

plotA 10 
plotB 9 
plotC 9

File 2:

And I would like the result to look like this:

plotA 10 98% 7/10 21 
plotB 9 98% 5/10 20 
plotC 9 98% 10/10 21

I have no idea how to solve this in Python. I tried to reorder file 2 with:

lines = file2.readlines() 
aaa = lines[0] + lines[3] + lines[6] 
bbb = lines[1] + lines[4] + lines[7] 
ccc = lines[2] + lines[5] + lines[8]

and then combine the pieces with zip, but I failed (and this approach is slow for large text files).

Any help?


2015-08-26 Hawk81

You can use itertools.izip_longest to slice file 2 into groups of three lines, and then use it again to zip those groups with the lines of the first file:

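# Python 2 (izip_longest and the print statement)
from itertools import izip_longest

with open('file1.txt') as f1, open('file2.txt') as f2:
    # Three references to the same iterator: each tuple takes three consecutive lines of file 2
    args = [iter(f2)] * 3
    # Pair each line of file 1 with one triple from file 2
    z = izip_longest(f1, izip_longest(*args), fillvalue='-')
    for line, tup in z:
        print '{:11}'.format(line.strip()), '{:5}{:5}{:>5}'.format(*map(str.strip, tup))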

If you want to write the result to a new file, open a file for writing and write each line to it instead of printing it.

Result:

plotA 10 98% 7/10 21 
plotB 9 98% 5/10 20 
plotC 9 98% 10/10 21
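
For readers on Python 3, where izip_longest is named zip_longest and print is a function, the same idea might look roughly like the sketch below, which also writes to a file instead of printing. The function name merge_files and its parameters (out_path, group_size, fill) are made up for illustration, and the sketch assumes that file 2 holds three consecutive lines for every line of file 1.

```python
from itertools import zip_longest


def merge_files(path1, path2, out_path, group_size=3, fill="-"):
    """Append each group of `group_size` lines from path2 to one line of path1.

    Hypothetical helper: assumes path2 stores the extra fields for each
    line of path1 as `group_size` consecutive lines.
    """
    with open(path1) as f1, open(path2) as f2, open(out_path, "w") as out:
        # The same iterator repeated group_size times, so each tuple
        # receives group_size consecutive lines of file 2.
        groups = zip_longest(*[iter(f2)] * group_size, fillvalue=fill)
        # The outer zip_longest pads the shorter side with None.
        for line, group in zip_longest(f1, groups):
            name = line.strip() if line is not None else fill
            if group is None:
                fields = [fill] * group_size
            else:
                fields = [item.strip() for item in group]
            out.write(" ".join([name] + fields) + "\n")
```

Calling merge_files('file1.txt', 'file2.txt', 'merged.txt') would then write one space-separated line per plot. Both input files are read lazily, so neither has to fit in memory.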


2015-08-26 19:18:56 Kasramvd

This raises 'TypeError: sequence item 1: expected string, tuple found', because each row in z is a nested tuple, e.g. ('plotA 10\n', ('98%\n', '5/10\n', '20\n'))

@AndyKubiak That was the post before the edit; it was edited ages ago ;-) – Kasramvd

Yep, saw your edit. Almost there. Figured I'd drop a formatting cherry on top for ya.
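
The TypeError discussed in these comments is easy to reproduce on a single row. The sketch below is written for Python 3, where the message wording differs slightly from the one quoted above, and uses the row quoted in the first comment: joining the nested tuple directly fails, while unpacking the outer pair first works.

```python
# One row as produced by zipping file 1 with the grouped lines of file 2.
row = ("plotA 10\n", ("98%\n", "5/10\n", "20\n"))

try:
    " ".join(row)  # the second item is a tuple, not a string
except TypeError as exc:
    error = str(exc)

# Unpack the outer pair, then strip and join the individual strings.
line, group = row
flat = " ".join([line.strip()] + [item.strip() for item in group])
print(flat)
```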

Here is an example; you will have to add error handling and otherwise improve it yourself :^)

file1 = open('file1')
file2 = open('file2')

# take one line in file1
for line in file1:
    # print result with tabulation to separate fields
    print '\t'.join(
        # the line from file1
        [line.strip()] +
        # and three lines from file2
        [file2.readline().strip() for _ in '123']
    )

Note that I used the string '123' because it is shorter than range(3) (and needs no function call); it can be any iterable that yields three steps.

Reading and processing only the data that is needed avoids loading the whole files into memory (as you said, the files are large).
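
As a rough Python 3 sketch of the same streaming idea with the error handling added, the version below uses itertools.islice to pull three lines at a time and context managers to close the files. The name merge_streaming and its parameters are made up for illustration, and raising ValueError when file 2 runs out early is just one possible policy.

```python
from itertools import islice


def merge_streaming(path1, path2, out_path, group_size=3):
    """Write each line of path1 followed by the next group_size lines of path2.

    Hypothetical helper: fields are tab-separated, as in the answer above.
    """
    with open(path1) as f1, open(path2) as f2, open(out_path, "w") as out:
        for line in f1:
            # islice consumes at most group_size lines from the open file.
            extra = [item.strip() for item in islice(f2, group_size)]
            if len(extra) < group_size:
                raise ValueError("file 2 ended before file 1")
            out.write("\t".join([line.strip()] + extra) + "\n")
```

Only one line of file 1 and three lines of file 2 are held in memory at any time, which is the property that matters for large inputs.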

Cheers.


2015-08-26 19:19:04 bufh

For clarity I would change your '123' to range(3); the string may be faster, but why obscure things in this case?

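Thanks for your comment, Michael. I thought the stripped-down, commented code was clear enough for a beginner, but you may be right. Forgive me for leaving it as is; in any case @Kasramvd's solution is more Pythonic and should be the accepted one :^) – bufh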

Thanks @bufh. It is always good to see other workable solutions! – Hawk81
