2017-08-08 51 views

Cleaning up lists with Python

I have a Python script (credit to alexander, from "Comparing large files with grep or python") that cleans up two lists of strings.

Now I want to modify it to produce a cleaned-up list with the duplicate strings removed:

filename_1 = 'A.txt'
filename_2 = 'B.txt'
filename_3 = 'C.txt'
with open(filename_1, 'r') as f1, open(filename_2, 'r') as f2, open(filename_3, 'w') as fout:
    s = set(val.strip() for val in f1.readlines())
    for row in f2:
        row = row.strip()
        if row not in s:
            fout.write(row + '\n')

Contents of the lists:

A.txt 
string1 
string2 

B.txt 
string1 
string3 

Expected result:

C.txt 
string1 
string2 
string3 

Thanks.

P.S.: I'm new here, sorry. What I actually needed was to remove the contents of B from list A. Thanks anyway.

Here are the answers I worked out (3 cases):

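Take the A.txt list and remove the contents of B.txt, writing the result to C.txt:

# Lines are lowercased, so matching is case-insensitive; set order is arbitrary.
with open('A.txt') as fa, open('B.txt') as fb:
    a = set(line.strip().lower() for line in fa)
    b = set(line.strip().lower() for line in fb)
with open('C.txt', 'w') as fout:
    fout.write('\n'.join(a.difference(b)))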

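Compare A.txt and B.txt and write the lines that are new in B.txt to C.txt:

# Same as above with the operands swapped: keep what is only in B.
with open('A.txt') as fa, open('B.txt') as fb:
    a = set(line.strip().lower() for line in fa)
    b = set(line.strip().lower() for line in fb)
with open('C.txt', 'w') as fout:
    fout.write('\n'.join(b.difference(a)))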

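Merge the contents of A.txt and B.txt into C.txt:

# Set union: every distinct line from either file, written once.
with open('A.txt') as fa, open('B.txt') as fb:
    a = set(line.strip().lower() for line in fa)
    b = set(line.strip().lower() for line in fb)
with open('C.txt', 'w') as fout:
    fout.write('\n'.join(b | a))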
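
The three snippets above rely on sets, so the original line order is lost, and `.lower()` discards the original casing. A minimal sketch of the same three operations that keeps first-seen order, assuming the lines are already loaded into Python lists and Python 3.7+ (where `dict` preserves insertion order). The function names `difference` and `union` are hypothetical, chosen only for illustration:

```python
def difference(first, second):
    """Items of `first` that are not in `second`, in original order, without duplicates."""
    exclude = set(second)
    # dict.fromkeys drops repeated items while keeping first-seen order.
    return [item for item in dict.fromkeys(first) if item not in exclude]


def union(first, second):
    """All items of both lists in first-seen order, without duplicates."""
    return list(dict.fromkeys(first + second))


a = ["string1", "string2"]
b = ["string1", "string3"]

print(difference(a, b))  # ['string2']
print(difference(b, a))  # ['string3']
print(union(a, b))       # ['string1', 'string2', 'string3']
```

Unlike the set-based versions, these compare strings exactly; lowercase the items first if case-insensitive matching is wanted.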

Wait, wait: what have you tried, and what problem did you run into? –


Remove the duplicate strings – acaler


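You want to remove the duplicate strings from listA and listB and then save the result to listC? If so, why does listC contain 'string2' when it is duplicated in listA and listB? –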

Answer

1

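The first part of the output file already contains the items of f2 that are not in f1, so you only need to write all of f1's items to the result as well.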

with open(filename_1, 'r') as f1, open(filename_2, 'r') as f2, open(filename_3, 'w') as fout:
    s = set(val.strip() for val in f1.readlines())
    for row in f2:
        row = row.strip()
        if row not in s:
            fout.write(row + '\n')
    for row in s:
        fout.write(row + '\n')
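
Because a set has no defined order, the second loop above may write A's lines in any order, and B's new lines end up before them. A sketch of a variant that writes A's lines in file order and then appends the lines of B not already seen, assuming blank lines should be skipped. The helper name `merge_files` is hypothetical:

```python
def merge_files(path_a, path_b, path_out):
    """Write the unique lines of path_a, then the new lines of path_b, to path_out."""
    seen = set()  # lines already written, for O(1) duplicate checks
    with open(path_a) as f1, open(path_b) as f2, open(path_out, "w") as fout:
        for source in (f1, f2):
            for row in source:
                row = row.strip()
                # Skip blank lines and anything written earlier.
                if row and row not in seen:
                    seen.add(row)
                    fout.write(row + "\n")
```

With the sample lists from the question, `merge_files('A.txt', 'B.txt', 'C.txt')` writes string1, string2, string3, one per line.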