Adding unique values to a list in Python

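I'm learning Python. Here is the relevant part of the exercise:

For each word, check whether the word is already in the list. If the word is not in the list, add it to the list.

Here is what I have so far.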

fhand = open('romeo.txt')
output = []

for line in fhand:
    words = line.split()
    for word in words:
        if word is not output:
            output.append(word)

print sorted(output)

Here is the output I get.

['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'and', 'and', 'breaks', 'east', 'envious', 'fair', 'grief', 'is', 'is', 'is', 'kill', 'light', 'moon', 'pale', 'sick', 'soft', 'sun', 'sun', 'the', 'the', 'the', 'through', 'what', 'window', 'with', 'yonder']

Note the duplicates (and, is, sun, etc.).

How do I get only the unique values?


2017-02-19 Tim Elhajj

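The idiomatic approach is to maintain a *set* of words to check against. With an ever-growing list, all of those linear scans degrade an otherwise linear algorithm to a quadratic one.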

Use this instead.

if word not in output:
    output.append(word)


2017-02-19 23:30:49

Instead of the 'is not' operator, you should use the 'not in' operator to check whether an item is in the list:

if word not in output:
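To make the difference concrete, here is a minimal sketch. The sample list and word are hypothetical stand-ins for the data read from the file, and the variable names identity_check and membership_check are only for illustration.

```python
# Hypothetical sample data, standing in for words read from the file.
output = ['But', 'soft']
word = 'soft'

# `is not` compares object identity: a string is never the same object
# as the list, so this is True for every word and nothing is filtered out.
identity_check = word is not output

# `not in` tests membership: it is False here because 'soft' is
# already an element of the list.
membership_check = word not in output

print(identity_check, membership_check)
```

Because the identity check is always true, the original loop appends every word, which is why the duplicates appear.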

BTW, using a set is much more efficient (see Time complexity):

with open('romeo.txt') as fhand:
    output = set()
    for line in fhand:
        words = line.split()
        output.update(words)
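As a rough, self-contained sketch of the same idea, the snippet below uses two in-memory lines instead of the file. The lines are an assumed stand-in for the contents of romeo.txt, and the names lines, unique_words and result are hypothetical. It is written for Python 3, so print is a function.

```python
# Hypothetical stand-in for the lines of romeo.txt.
lines = [
    "It is the east and Juliet is the sun",
    "Arise fair sun and kill the envious moon",
]

unique_words = set()
for line in lines:
    # update() adds every word on the line; words already present are ignored.
    unique_words.update(line.split())

# A set has no defined order, so sort it to get a stable list for printing.
result = sorted(unique_words)
print(result)
```

Sorting at the end mirrors the sorted(output) call in the question, so the duplicates disappear without changing how the result is displayed.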

UPDATE: a set does not preserve the original order. To preserve the order, use the set as an auxiliary data structure:

output = []
seen = set()
with open('romeo.txt') as fhand:
    for line in fhand:
        words = line.split()
        for word in words:
            if word not in seen:  # faster than `word not in output`
                seen.add(word)
                output.append(word)
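A possible alternative, not part of the answer above: in Python 3.7 and later, dict keeps insertion order, so dict.fromkeys can deduplicate while keeping the first occurrence of each word. The sketch below assumes that Python version and uses hypothetical in-memory lines in place of the file.

```python
# Hypothetical stand-in for the lines of romeo.txt.
lines = [
    "It is the east and Juliet is the sun",
    "Arise fair sun and kill the envious moon",
]

# Flatten the lines into one list of words, duplicates included.
words = [word for line in lines for word in line.split()]

# dict.fromkeys keeps only the first occurrence of each key, and
# dicts preserve insertion order in Python 3.7+ (assumed here).
output = list(dict.fromkeys(words))
print(output)
```

This trades the explicit seen set for a built-in, at the cost of relying on the ordering guarantee of newer Python versions.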


2017-02-19 23:30:36 falsetru

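Thank you all. I appreciate the help.

The exercise asks for the words to be listed in the order of their first appearance, so I don't see how 'set()' can *replace* the list, although it is clearly a useful auxiliary data structure.

@JohnColeman, thank you for the comment. I thought it didn't matter because the OP used 'sorted' at the end of the code. I'll update the answer to include a version that preserves the order. – falsetru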

Here is a "one-liner" that uses this implementation of removing duplicates while preserving order:

def unique(seq): 
    seen = set() 
    seen_add = seen.add 
    return [x for x in seq if not (x in seen or seen_add(x))] 

output = unique([word for line in fhand for word in line.split()])

The last line flattens fhand into a list of words, and then calls unique() on the resulting list.
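For a quick check of how unique() behaves, here is a small usage sketch with a hypothetical word list in place of the file. The filter relies on set.add returning None: for a first occurrence, x in seen is False, so seen_add(x) runs, returns None, and the element is kept. For a repeat, x in seen is True and the element is skipped.

```python
def unique(seq):
    seen = set()
    seen_add = seen.add  # bind the method once to avoid repeated lookups
    # Keep x only the first time it appears; seen_add(x) records it
    # and returns None, which makes the `or` expression falsy.
    return [x for x in seq if not (x in seen or seen_add(x))]

# Hypothetical sample input with repeated words.
words = ['the', 'sun', 'and', 'the', 'moon', 'and', 'sun']
output = unique(words)
print(output)
```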


2017-02-19 23:53:13
