Appending in Python

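I want to open a file and read it line by line. For each line, I want to use the split() method to split the line into a list of words. Then I want to check each word on each line to see whether it is already in the list and, if it is not, append it to the list. This is the code I wrote.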

fname = raw_input("Enter file name: ")
fh = open(fname)
line1 = list()
for line in fh:
    stuff = line.rstrip().split()
    for word in stuff:
        if stuff not in stuff:
            line1.append(stuff)
print line1

My problem is that when I print line1, it prints about 30 duplicated lists in a format like this.

['But', 'soft', 'what', 'light', 'through', 'yonder', 'window', 'breaks'],
['But', 'soft', 'what', 'light', 'through', 'yonder', 'window', 'breaks'],
['It', 'is', 'the', 'east', 'and', 'Juliet', 'is', 'the', 'sun'],
['It', 'is', 'the', 'east', 'and', 'Juliet', 'is', 'the', 'sun'],
['Arise', 'fair', 'sun', 'and', 'kill', 'the', 'envious', 'moon'],
['Arise', 'fair', 'sun', 'and', 'kill', 'the', 'envious', 'moon'],

I would like to know why this happens, and how to remove the duplicate words and lists.

Source

2016-03-14 David Asmah

Not sure what you are trying to do, but I have a feeling that `if stuff not in stuff` is going to hurt you at least a little – inspectorG4dget

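Your condition is `if stuff not in stuff:`. I think you meant `if word not in line1:`? If that is not the case, could you explain more clearly what you want to happen? –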

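You have `if stuff not in stuff`. A list of strings never contains itself, so that condition is always true, and the whole per-line list `stuff` is appended once for every word on the line; that is where the repeated lists come from. If you change that line to `if word not in line1:` and the next line to `line1.append(word)`, your code should work.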
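A minimal sketch of that fix follows. It is written for Python 3 (so `print()` rather than the `print` statement used in the question) and wrapped in a hypothetical helper, `unique_words`, that accepts any iterable of lines instead of prompting for a file name. Both choices are assumptions made for illustration, not part of the original code.

```python
def unique_words(lines):
    """Return the words from `lines` in first-seen order, without duplicates."""
    seen = []  # plays the role of `line1` in the question
    for line in lines:
        for word in line.rstrip().split():
            # Compare the individual word against the accumulated list,
            # not the per-line list against itself.
            if word not in seen:
                seen.append(word)
    return seen


# Hypothetical sample input standing in for the lines of the file.
sample = [
    "But soft what light through yonder window breaks\n",
    "It is the east and Juliet is the sun\n",
]

stuff = sample[0].split()
print(stuff not in stuff)    # True: a list of strings never contains itself
print(unique_words(sample))  # each word appears once, in first-seen order
```

With a real file, the same helper could be called as `unique_words(fh)` inside a `with open(fname) as fh:` block, since a file object yields its lines when iterated.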

Alternatively, use a set.

fname = raw_input("Enter file name: ")
fh = open(fname)
line1 = set()
for line in fh:
    stuff = line.rstrip().split()
    for word in stuff:
        line1.add(word)
print line1

Or even:

fname = raw_input("Enter file name: ") 
fh = open(fname) 
line1 = set() 
for line in fh: 
    stuff = line.rstrip().split() 
    line1 = line1.union(set(stuff)) 
print line1

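A set only contains unique values (although it has no concept of ordering or indexing), so you do not need to check whether a word has already appeared: the set data type handles that automatically.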
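To illustrate the point about uniqueness and ordering, here is a small Python 3 sketch. The helper name `unique_words_set` and the sample lines are hypothetical, chosen only for this example. Because a set has no defined order, the result is sorted before printing so the output is predictable.

```python
def unique_words_set(lines):
    """Collect every distinct word from `lines` into a set."""
    words = set()
    for line in lines:
        # update() adds each word; words already present are ignored.
        words.update(line.split())
    return words


# Hypothetical sample input standing in for the lines of the file.
sample = [
    "Arise fair sun and kill the envious moon\n",
    "It is the east and Juliet is the sun\n",
]

words = unique_words_set(sample)
print(len(words))     # number of distinct words
print(sorted(words))  # sorted() returns a list in a predictable order
```

If first-seen order matters, a set alone is not enough, and the list-based check with `if word not in line1:` is the simpler choice.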

Source

2016-03-14 18:09:54
