Compare two text files (order doesn't matter) and output the words the two files have in common to a third file

I've just started programming, and I'm trying to compare two files that look like this:

file1: 
tootsie roll 
apple 
in the evening 

file2: 
hello world 
do something 
apple 

output: 
"Apple appears x times in file 1 and file 2"

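I'm really stumped. I've tried creating dictionaries, lists, tuples, and sets, and I can't seem to get the output I want. The closest I've gotten is output in which the lines appear exactly as they do in file1/file2.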

I've tried several code snippets, and I can't get any of them to output what I want. Any help would be greatly appreciated!!

This is the last piece of code I tried; it doesn't write any output to my third file.

f1 = open("C:\\Users\\Cory\\Desktop\\try.txt", 'r') 
f2 = open("C:\\Users\\Cory\\Desktop\\match.txt", 'r') 
output = open("C:\\Users\\Cory\\Desktop\\output.txt", 'w') 

file1 = set(f1) 
file2 = set(f2) 
file(word,freq) 
for line in f2: 
    word, freq = line.split() 
    if word in words: 
     output.write("Both files have the following words: " + file1.intersection(file2)) 
f1.close() 
f2.close() 
output.close()

Source

2015-11-20 Cory Gottfried

What output exactly do you want? – vincent

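I want my third file to contain output for every word that matches across the files. For example, if apple is anywhere in file 1 and apple is anywhere in file 2, I would get the output Apple: x, where x is the number of times apple appears in both files. I then want to know how many times that word occurs in each of the two files. –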

You don't need all those loops. If the files are small (i.e., less than a few hundred MB), you can work with them more directly:

words1 = f1.read().split() 
words2 = f2.read().split() 
words = set(words1) & set(words2)

After this, words will be a set containing every word the two files have in common. You can call lower() on the text before splitting it to ignore case.
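As a rough illustration of the lower() suggestion, here is a minimal sketch that works on strings rather than open files. The helper name common_words and the sample strings are assumptions made for this example (the sample text mirrors the files in the question, with "Apple" capitalized in one of them).

```python
def common_words(text1, text2):
    """Return the set of words found in both texts, ignoring case."""
    # Lowercase before splitting so "Apple" and "apple" compare equal.
    words1 = text1.lower().split()
    words2 = text2.lower().split()
    return set(words1) & set(words2)


# Hypothetical sample contents modeled on file1 and file2 from the question.
file1_text = "tootsie roll\nApple\nin the evening\n"
file2_text = "hello world\ndo something\napple\n"

print(common_words(file1_text, file2_text))
```

Note that split() with no arguments splits on any whitespace, so a line such as "tootsie roll" contributes two separate words, not one phrase.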

To get the count for each word, as you mentioned in a comment, just use the count() method:

with open('outfile.txt', 'w') as output: 
    for word in words: 
        # One line per shared word, with its count in each file.
        output.write('{} appears {} times in f1 and {} times in f2.\n'.format(word, words1.count(word), words2.count(word)))
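For larger inputs, calling count() once per shared word rescans the whole word list each time. One possible alternative, not part of the answer above, is collections.Counter from the standard library, which tallies every word in a single pass. The sketch below puts the steps together; the function name count_common_words, its parameters, and the exact output wording are assumptions for illustration.

```python
from collections import Counter


def count_common_words(path1, path2, out_path):
    """Write the per-file counts of every word shared by two text files.

    Returns the sorted list of shared words. Comparison ignores case.
    """
    with open(path1) as f1, open(path2) as f2:
        # Counter maps each word to the number of times it occurs.
        counts1 = Counter(f1.read().lower().split())
        counts2 = Counter(f2.read().lower().split())

    # The key views of two dict-like objects support set intersection.
    common = sorted(counts1.keys() & counts2.keys())

    with open(out_path, "w") as output:
        for word in common:
            output.write(
                "{} appears {} times in file 1 and {} times in file 2.\n".format(
                    word, counts1[word], counts2[word]
                )
            )
    return common
```

It could be called as count_common_words("try.txt", "match.txt", "output.txt"), with the paths adjusted to wherever the files actually live.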

Source

2015-11-20 02:20:15 TigerhawkT3
