Compare two files and replace

file1 has more than 1000 lines, for example:

:) 
still good 
not 
candy....wasn't even the good stuff. 
how could i ever forget? #biggestdayoftheyear 
not even think 
will be

file2 has more than 1000 lines, for example:

1,even,2 
2,be,1 
3,good,2 
4,:),1 
5,forget?,1 
6,i,1 
7,stuff.,1 
8,#biggestdayoftheyear,1 
9,think,1 
10,will,1 
11,how,1 
12,not,2 
13,the,1 
14,still,1 
15,ever,1 
16,could,1 
17,candy....wasn't,1

Code:

file1 = 'C:/Users/Desktop/file1.txt'
file2 = 'C:/Users/Desktop/file2.txt'

with open(file1) as f1:
    for line1 in f1:
        sline1 = str(line1.strip().split(' '))
        print sline1

with open(file2) as f2:
    for line2 in f2:
        sline2 = line2.split(',')
        #print sline2[0], sline2[1]
        if sline2[1] in sline1:
            print sline1.replace(sline1, sline2[0])

The code only prints:

2 
6 
10

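What am I missing? Any suggestions?

I want to replace every word in file1 with the number from column 1 of file2, using column 2 of file2 to check whether the words are the same.

Expected result: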

4 
14 3 
12 
17 1 13 3 7 
11 16 6 15 5 8
12 1 9 
10 2

Source

2014-02-26 ThanaDaray

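What is your question? Is it not working the way you expect? –

Sorry, I forgot to mention that. – ThanaDaray

Are the lines in the two files in a particular order? Does the first line of file 1 have to be compared with the first line of file 2, or do you have to loop over all the lines of file 2 for every line of file 1? If you do find a match, do you need to break out or keep looking for more matches? – sabbahillel

You will need to build an inverted index from file2.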

inverted_index = {}
with open(file2) as f2:
    for line in f2:
        key, value, _ = line.split(',')
        inverted_index[value] = key

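Then, as you loop through file1, look each word up in that inverted index:

with open(file1) as f1:
    for line in f1:
        # Words missing from the index are left unchanged.
        print(' '.join([inverted_index.get(word, word) for word in line.strip().split(' ')]))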
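For anyone who wants to try the idea without creating the two files, here is a minimal self-contained sketch. It assumes Python 3 and uses the sample lines from the question as in-memory lists; `build_inverted_index` and `replace_words` are hypothetical helper names introduced only for illustration, and the code assumes no word in file2 contains a comma.

```python
# Sample data copied from the question, standing in for the real files.
FILE1_LINES = [
    ":)",
    "still good",
    "not",
    "candy....wasn't even the good stuff.",
    "how could i ever forget? #biggestdayoftheyear",
    "not even think",
    "will be",
]

FILE2_LINES = [
    "1,even,2", "2,be,1", "3,good,2", "4,:),1", "5,forget?,1", "6,i,1",
    "7,stuff.,1", "8,#biggestdayoftheyear,1", "9,think,1", "10,will,1",
    "11,how,1", "12,not,2", "13,the,1", "14,still,1", "15,ever,1",
    "16,could,1", "17,candy....wasn't,1",
]


def build_inverted_index(lines):
    """Map each word (column 2) to its number (column 1)."""
    index = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        number, word, _count = line.split(',')
        index[word] = number
    return index


def replace_words(lines, index):
    """Replace every known word with its number; unknown words are kept."""
    return [' '.join(index.get(word, word) for word in line.split())
            for line in lines]


index = build_inverted_index(FILE2_LINES)
result = replace_words(FILE1_LINES, index)
for line in result:
    print(line)
```

Because the index is a dictionary, each word lookup is a single hash lookup, so file2 is read only once no matter how many lines file1 has.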

Source

2014-02-26 14:09:13 Menno

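I noticed that you loop through file1 and assign sline1 on every iteration, but you only loop through file2 to do the comparison after that first loop has finished. As a result, you only ever process the last value of sline1 (because you have already left that loop). Once you have built the dictionary-based inverted index as Menno shows, you can set up the replacement step.

Source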
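The effect can be reproduced in a few lines. This is only a sketch, assuming Python 3, with the last two lines of file1 from the question standing in for the real file and a handful of words from file2 as candidates. It also suggests why `i` produced a match: `sline1` is a string, so `in` performs a substring test and finds `i` inside `will`.

```python
lines = ["not even think", "will be"]

for line1 in lines:
    sline1 = str(line1.strip().split(' '))

# The loop has finished, so sline1 holds only the last line,
# stored as the text of a list rather than as a list.
print(sline1)  # ['will', 'be']

# `in` on a string is a substring test, so "i" matches inside "will".
candidates = ["even", "be", "i", "will", "think"]
matches = [word for word in candidates if word in sline1]
print(matches)
```

The three words that match correspond to the numbers 2, 6 and 10 in file2, which is the output reported in the question.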

2014-02-26 14:14:51 sabbahillel
