Modify every line of a text file in Python

I have a large file like the example below:

1 10161 10166 3 
1 10166 10172 2 
1 10172 10182 1 
1 10183 10192 1 
1 10193 10199 1 
1 10212 10248 1 
1 10260 10296 1 
1 11169 11205 1 
1 11336 11372 1 
2 11564 11586 2 
2 11586 11587 3 
2 11587 11600 4 
3 11600 11622 2

I want to add "chr" to the beginning of each line, for example:

chr1 10161 10166 3 
chr1 10166 10172 2 
chr1 10172 10182 1 
chr1 10183 10192 1 
chr1 10193 10199 1 
chr1 10212 10248 1 
chr1 10260 10296 1 
chr1 11169 11205 1 
chr1 11336 11372 1 
chr2 11564 11586 2 
chr2 11586 11587 3 
chr2 11587 11600 4 
chr3 11600 11622 2

I tried the following code in Python:

file = open("myfile.bg", "r")
for line in file:
    newline = "chr" + line
out = open("outfile.bg", "w")
for new in newline:
    out.write("n"+new)

But it doesn't return what I want. Do you know how to fix this code?

Source

2017-10-04 user7249622

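1) You have to concatenate the strings, keeping the newline (e.g. with +=); see my version. 2) Please post the result, or any error. – Thecave3

The error isn't needed now, since the question has been answered, but including the output you are seeing is usually very helpful. – ryachza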

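The problem is that you iterate over the input and set the same variable (newline) on every line, then open the output file and iterate over newline, which is a string, so new will be each character in that string.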
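To make that concrete, here is a small sketch that reproduces the behaviour in memory. It uses io.StringIO objects as stand-ins for the two files (an assumption made only so the example is self-contained); the two sample lines are taken from the question.

```python
import io

# In-memory stand-in for myfile.bg (assumption: real code would use open()).
infile = io.StringIO("1 10161 10166 3\n2 11564 11586 2\n")

for line in infile:
    newline = "chr" + line  # overwritten on every pass; only the last line survives

print(repr(newline))  # 'chr2 11564 11586 2\n'

# Iterating over a string yields single characters, not lines.
chars = [new for new in newline]
print(chars[:4])  # ['c', 'h', 'r', '2']

# In-memory stand-in for outfile.bg.
out = io.StringIO()
for new in newline:
    out.write("n" + new)  # writes a literal 'n' before every character

print(out.getvalue()[:8])  # ncnhnrn2
```

So the output file ends up containing only the last input line, with an "n" wedged in front of each character.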

I think something like this is what you're looking for:

with open('myfile.bg', 'rb') as file:
    with open('outfile.bg', 'wb') as out:
        for line in file:
            # Binary mode yields bytes, so the prefix must be bytes as well
            out.write(b'chr' + line)

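When iterating over a file, line should already include the trailing newline.

The with statement automatically cleans up the file handles when the block ends.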
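As a variation, both files can be managed by a single with statement. The sketch below wraps this in a helper; the name add_prefix and its parameters are made up for illustration and are not part of the original answer. It opens the files in text mode, so the prefix is an ordinary string.

```python
def add_prefix(src_path, dst_path, prefix="chr"):
    """Copy src_path to dst_path, prepending prefix to every line."""
    # One with statement can manage both files; both are closed on exit,
    # even if an exception is raised inside the block.
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(prefix + line)  # line keeps its own trailing newline
```

With the file names from the question this would be called as add_prefix("myfile.bg", "outfile.bg").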

Source

2017-10-04 18:09:44 ryachza

@thebjorn What doesn't work? When I tested it, it worked perfectly. What output are you seeing? – ryachza

The problem with your code is that you iterate over the input file without doing anything with the data you read:

file = open("myfile.bg", "r") 
for line in file: 
    newline = "chr" + line

The last line assigns each line of myfile.bg to the newline variable (a string, with 'chr' prepended), and each line overwrites the previous result.

You then iterate over the string in newline (which will be the last line of the input file, with 'chr' prepended):

out = open("outfile.bg", "w") 
for new in newline:  # <== this iterates over a string, so `new` will be individual characters 
    out.write("n"+new) # this only writes 'n' before each character in newline

If you're only doing this once, e.g. in a shell, you can use a one-liner:

open('outfile.bg', 'w').writelines(['chr' + line for line in open('myfile.bg').readlines()])

More correct (especially in a program, where you care about open file handles and so on) would be:

with open('myfile.bg') as infp: 
    lines = infp.readlines() 
with open('outfile.bg', 'w') as outfp: 
    outfp.writelines(['chr' + line for line in lines])

If the file is really big (approaching the size of available memory), you need to process it incrementally:

with open('myfile.bg') as infp:
    with open('outfile.bg', 'w') as outfp:
        for line in infp:
            outfp.write('chr' + line)

(This is much slower than the first two versions, though..)
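A middle ground, sketched here as an assumption rather than as part of the original answer, is to pass a generator to writelines. writelines accepts any iterable of strings, so this code never builds a list of all lines. The helper names prefix_lines and stream_prefix are hypothetical.

```python
def prefix_lines(lines, prefix="chr"):
    """Yield each line with prefix prepended, one line at a time."""
    for line in lines:
        yield prefix + line


def stream_prefix(src_path, dst_path, prefix="chr"):
    """Write a prefixed copy of src_path to dst_path without a line list."""
    with open(src_path) as infp, open(dst_path, "w") as outfp:
        # The file object yields lines on demand and the generator passes
        # them on one by one, so no list of all lines is created here.
        outfp.writelines(prefix_lines(infp, prefix))
```

Whether this is measurably faster than the explicit loop depends on the file and the Python version, so it is worth timing on real data before relying on it.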

Source

2017-10-04 18:11:26 thebjorn

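The only issue I see here is memory usage if the file is large. – ryachza

What problem is the temporary file trying to solve? My only thought is a hostile reader that could open the file while it is being written, but because of buffering that would be a problem at any file size. – ryachza

You can't open the same file for reading and writing, especially here: you are writing more data than you are reading, so you would end up reading the new data instead of the old. The problem might not show up until your file size exceeds your stdio buffer, though.. – thebjorn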
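If the goal really were to modify the file in place, one common approach is to write to a temporary file and then swap it over the original. The following is a minimal sketch under that assumption, not code from the thread; the function name prefix_in_place is hypothetical, and os.replace requires Python 3.3 or later.

```python
import os
import tempfile


def prefix_in_place(path, prefix="chr"):
    """Rewrite path so every line starts with prefix, via a temporary file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so that os.replace
    # does not have to cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        with os.fdopen(fd, "w") as tmp, open(path) as src:
            for line in src:
                tmp.write(prefix + line)
        # Swap in the new file only after it has been fully written.
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)  # do not leave a half-written temp file behind
        raise
```

Because the input is read from one file and written to another, the reader never sees the newly written data. Note that mkstemp creates the file with restrictive permissions, so the original file's permissions are not preserved by this sketch.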

I completely agree with @ryachza; here is a version that uses your code:

file = open("myfile.bg", "r") 
out = open("outfile.bg", "w") 
for line in file: 
    out.write("chr" + line) 
out.close() 
file.close()

Source

2017-10-04 18:12:21 Thecave3

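You can't open the same file for input and output (at least not when it is larger than the stdio buffer size). Also, you are leaking file handles. – thebjorn

@thebjorn The answer doesn't do that: the input and output files are different. – ryachza

Ah, sorry, my bad. – thebjorn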
