2016-06-12 85 views
0

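Convert a list of strings to unique lowercase, preserving order (Python 2.7)

I want to convert a list of strings to lowercase and remove duplicates while preserving the order. Many of the one-line Python tricks I found on StackOverflow convert a list of strings to lowercase, but the order seems to get lost.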

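I have written the code below, which actually works, and I'm happy to stick with it. But I'd like to know whether there is a more Pythonic way to do this with less code (and probably fewer bugs if I write something similar in the future; this one took me a long time to write).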

def word_list_to_lower(words):
    """Takes a word list with a special order (e.g. frequency);
    returns a new all-lowercase word list with no duplicates, preserving order."""

    print("word_list_to_lower")
    # save orders in a dict
    orders = dict()
    for i in range(len(words)):
        wl = words[i].lower()

        # save index of first occurrence of the word (prioritizing top value)
        if wl not in orders:
            orders[wl] = i

    # contains unique lower case words, but in wrong order
    words_unique = list(set(map(str.lower, words)))

    # reconstruct sparse list in correct order
    words_lower = [''] * len(words)
    for w in words_unique:
        i = orders[w]
        words_lower[i] = w

    # remove blank entries
    words_lower = [s for s in words_lower if s != '']

    return words_lower

Answers

1

Slightly modified from "How do you remove duplicates from a list whilst preserving order?":

def f7(seq): 
    seen = set() 
    seen_add = seen.add 
    seq = (x.lower() for x in seq) 
    return [x for x in seq if not (x in seen or seen_add(x))] 
+0

Wow, thanks, that's great. The seen_add trick is an interesting insight too. – memo

+0

Is 'seen_add(...)' better than 'seen.add(...)'? IMO, it's worse. – zondo
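
For comparison, here is a rough sketch of the same idea written as a plain loop that calls `seen.add` directly. The name `unique_lower` and the sample list are made up for illustration and are not from the answer. The local alias in the answer only avoids looking up the `add` attribute on every iteration; the result should be the same either way.

```python
def unique_lower(words):
    """Lowercase each word and drop later duplicates, keeping first-seen order."""
    seen = set()      # lowercase words already emitted
    result = []
    for word in words:
        lowered = word.lower()
        if lowered not in seen:
            seen.add(lowered)       # direct method call, no local alias
            result.append(lowered)
    return result


# hypothetical sample data
print(unique_lower(['ONE', 'one', 'TWO', 'two']))
```

Whether the shorter comprehension or the explicit loop reads better is largely a matter of taste.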

+0

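It would be a little more efficient if you used parentheses '()' instead of brackets '[]' when defining 'seq'. That way you create a generator that supplies values on demand, rather than a list that has to store every value in memory. – zondo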
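
A small sketch of the difference this comment describes, using a made-up word list (the variable names are assumptions for illustration): the bracketed form builds the whole list immediately, while the parenthesised form produces each value only when asked and can be consumed just once.

```python
words = ['ONE', 'one', 'TWO', 'two']  # hypothetical sample data

# List comprehension: every lowercase string is created and stored right away.
as_list = [w.lower() for w in words]

# Generator expression: nothing is computed until a value is requested.
as_gen = (w.lower() for w in words)

first = next(as_gen)   # computes only the first value
rest = list(as_gen)    # consumes whatever is left; the generator is now empty

print(as_list)
print(first)
print(rest)
```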

0

A modified answer: just do something like this:

initial_list = ['ONE','one','TWO','two']
# lowercase first so 'ONE' and 'one' collapse; note that set() does not keep order
unique_list = list(set(x.lower() for x in initial_list))

print unique_list
+0

One of the key points of the question is that the order must be preserved. Your solution does not preserve the order. – zondo

1

You could also do this:

pip install orderedset 

and then:

from orderedset import OrderedSet
initial_list = ['ONE','one','TWO','two','THREE','three']
# lowercase before deduplicating so 'ONE' and 'one' count as the same word
unique_list = list(OrderedSet(x.lower() for x in initial_list))

print unique_list
0
initial_list = ['ONE','one','TWO','two']
new_list = []
for s in initial_list:
    lowered = s.lower()
    if lowered not in new_list:
        new_list.append(lowered)
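
Note that checking membership in `new_list` rescans the list for every word, which can get slow on long inputs. One possible alternative, not proposed in any of the answers above, is `collections.OrderedDict` from the standard library (available in Python 2.7): its keys are unique and keep insertion order. A minimal sketch, reusing the same sample data:

```python
from collections import OrderedDict

initial_list = ['ONE', 'one', 'TWO', 'two']

# fromkeys() keeps the first occurrence of each key, in insertion order;
# the values are unused, only the ordered unique keys matter here.
new_list = list(OrderedDict.fromkeys(s.lower() for s in initial_list))

print(new_list)
```

This needs no third-party package, at the cost of building a throwaway dictionary.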