Using `.index()` on repeated letters

I'm building a function that builds a dictionary from a word, like this:

{'b': ['b', 'bi', 'bir', 'birt', 'birth', 'birthd', 'birthda', 'birthday'], 
'bi': ['bi', 'bir', 'birt', 'birth', 'birthd', 'birthda', 'birthday'], 
'birt': ['birt', 'birth', 'birthd', 'birthda', 'birthday'], 
'birthda': ['birthda', 'birthday'], 
'birthday': ['birthday'], 
'birth': ['birth', 'birthd', 'birthda', 'birthday'], 
'birthd': ['birthd', 'birthda', 'birthday'], 
'bir': ['bir', 'birt', 'birth', 'birthd', 'birthda', 'birthday']}

This is what it looks like:

def add_prefixs(word, prefix_dict):
    lst = []
    for letter in word:
        n = word.index(letter)
        if n == 0:
            lst.append(word[0])
        else:
            lst.append(word[0:n])
    lst.append(word)
    lst.remove(lst[0])
    for elem in lst:
        b = lst.index(elem)
        prefix_dict[elem] = lst[b:]
    return prefix_dict

It works for words like "birthday", but I run into a problem when a word has a repeated letter, such as "hello".

{'h': ['h', 'he', 'he', 'hell', 'hello'], 'hell': ['hell', 'hello'], 'hello': ['hello'], 'he': ['he', 'he', 'hell', 'hello']}

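I know this happens because of `.index()` (Python picks the index of the first occurrence of the letter), but I don't know how to fix it. Yes, this is my homework, and I really want to learn from you guys :)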
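To make the cause concrete, here is a small sketch (the variable names `positions` and `slices` are only illustrative, not part of the original code) showing that `str.index()` reports the first occurrence, so both `l` characters in "hello" map to the same position:

```python
word = "hello"

# str.index() returns the position of the FIRST match, so the second "l"
# reports index 2 instead of 3.
positions = [word.index(letter) for letter in word]
print(positions)  # [0, 1, 2, 2, 4]

# Slicing with those positions repeats "he" and never produces "hel".
slices = [word[0:n] for n in positions]
print(slices)  # ['', 'h', 'he', 'he', 'hell']
```

That repeated `'he'` is exactly the duplicate that shows up in the output above.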

Source

2012-11-27 Yarden

Why is that? It works. – Yarden

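@Yarden: That's because I replaced the tabs with spaces. The editor uses 8 spaces per tab, while the code is rendered with only 4, so your indentation was off starting at the first line you indented manually. –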

You are already looping over the word; keep a counter instead of using .index(). Python makes this very easy for you; use the enumerate() function:

for n, letter in enumerate(word):
    if n == 0:
        lst.append(word[0])
    else:
        lst.append(word[0:n])

Now you are no longer using the letter variable, so just use range(len(word)) instead:

for n in range(len(word)):
    if n == 0:
        lst.append(word[0])
    else:
        lst.append(word[0:n])

We can simplify this down to a list comprehension:

lst = [word[0:max(n, 1)] for n in range(len(word))]

Note the max() in there; instead of testing whether n is 0, we set a minimum value of 1 for the slice end.
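As a quick check of what the max() version produces, here is a minimal sketch using "hello" as the sample word (my choice of input, not from the question's function call):

```python
word = "hello"

# max(n, 1) keeps the slice end at 1 or more, so n == 0 and n == 1
# both yield the one-letter prefix.
lst = [word[0:max(n, 1)] for n in range(len(word))]
print(lst)  # ['h', 'h', 'he', 'hel', 'hell']
```

The first prefix appears twice and the full word is still missing, which is why the original function needed the extra remove() and append() calls.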

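Since you then go on to remove the first item again (because it is the same as the second result) and append the full word, just add 1 to the n counter instead: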

lst = [word[0:n+1] for n in range(len(word))]

The second half of your function can make good use of the enumerate() function too, instead of .index():

for b, elem in enumerate(lst):
    prefix_dict[elem] = lst[b:]

Now your function is a lot simpler. Note that there is no need to return prefix_dict, since you are manipulating it in place:

def add_prefixs(word, prefix_dict):
    lst = [word[0:n+1] for n in range(len(word))]
    for b, elem in enumerate(lst):
        prefix_dict[elem] = lst[b:]
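For completeness, a short usage sketch of the simplified function with the word that caused trouble in the question. The sorted-by-length printing loop is just my way of showing the result in a readable order:

```python
def add_prefixs(word, prefix_dict):
    lst = [word[0:n+1] for n in range(len(word))]
    for b, elem in enumerate(lst):
        prefix_dict[elem] = lst[b:]

prefix_dict = {}
add_prefixs("hello", prefix_dict)  # fills prefix_dict in place

# Print the entries from shortest to longest prefix.
for prefix in sorted(prefix_dict, key=len):
    print(prefix, prefix_dict[prefix])
```

Each prefix, including `'hel'`, now appears once, because positions come from the loop counter rather than from a search for the letter.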


2012-11-27 16:47:13

That worked! Thanks a lot... – Yarden

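It's easy to simplify your solution by thinking in terms of indices rather than letters. Usually in Python we loop over values, because the values are what we care about. Here we are actually generating prefix strings, where the content doesn't matter but the position does: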

def prefixes(seq):
    for i in range(len(seq)):
        yield seq[:i+1]

segments = list(prefixes("birthday"))
print({segment: segments[start:] for start, segment in enumerate(segments)})

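What you really want is every prefix of the word, which the generator above produces. This is one of those rare cases where looping over indices is a valid option, because the index is exactly what we are working with.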

We then use a dictionary comprehension to select the right group of "children" for each segment.

This gives us (with some whitespace added for clarity):

{ 
    'birt': ['birt', 'birth', 'birthd', 'birthda', 'birthday'], 
    'bir': ['bir', 'birt', 'birth', 'birthd', 'birthda', 'birthday'], 
    'birthday': ['birthday'], 
    'bi': ['bi', 'bir', 'birt', 'birth', 'birthd', 'birthda', 'birthday'], 
    'birthda': ['birthda', 'birthday'], 
    'b': ['b', 'bi', 'bir', 'birt', 'birth', 'birthd', 'birthda', 'birthday'], 
    'birthd': ['birthd', 'birthda', 'birthday'], 
    'birth': ['birth', 'birthd', 'birthda', 'birthday'] 
}
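The same approach can be checked against the repeated-letter case from the question. This is only a sketch; `result` is a name I introduced for the comprehension's output:

```python
def prefixes(seq):
    for i in range(len(seq)):
        yield seq[:i+1]

segments = list(prefixes("hello"))
result = {segment: segments[start:] for start, segment in enumerate(segments)}

print(segments)      # ['h', 'he', 'hel', 'hell', 'hello']
print(result["he"])  # ['he', 'hel', 'hell', 'hello']
```

Because the slices are driven by position, the doubled `l` causes no duplicate entries.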

If you don't mind some extra looping, we can even simplify it down to:

def prefixes(word):
    for i in range(len(word)):
        segment = word[:i+1]
        # Pair each prefix with itself and every longer prefix of the word.
        yield segment, [word[:j+1] for j in range(i, len(word))]

print(dict(prefixes("birthday")))

As a side note, another implementation of prefixes() is:

def prefixes(seq): 
    return prefixes(seq[:-1])+[seq] if seq else []

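However, this is a recursive function, and since Python is not optimized for recursion, it is a poor approach. It also builds a list rather than a generator, which is less memory-efficient in some cases.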
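To illustrate the recursion concern, here is a hedged sketch comparing the two styles. The names `recursive_prefixes` and `iter_prefixes` are mine, chosen so both versions can live side by side, and the sketch assumes Python 3.5 or later, where exceeding the recursion limit raises RecursionError:

```python
import sys

def recursive_prefixes(seq):
    # Same logic as the recursive one-liner, renamed for comparison.
    return recursive_prefixes(seq[:-1]) + [seq] if seq else []

def iter_prefixes(seq):
    # Generator version: one prefix at a time, no recursion.
    for i in range(len(seq)):
        yield seq[:i+1]

# Both agree on short inputs.
assert recursive_prefixes("hello") == list(iter_prefixes("hello"))

# An input longer than the interpreter's recursion limit breaks the
# recursive version, while the generator handles it without trouble.
long_input = "a" * (sys.getrecursionlimit() + 100)
try:
    recursive_prefixes(long_input)
    hit_limit = False
except RecursionError:
    hit_limit = True

print(hit_limit)
print(sum(1 for _ in iter_prefixes(long_input)) == len(long_input))
```

For ordinary words this limit is never reached, so the point is mostly about style and scaling rather than a practical failure here.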


2012-11-27 16:52:53

Martijn was faster than me, but I have a few additions:

def add_prefixs(word, prefix_dict):
    lst = []
    for n, letter in enumerate(word):
        if n > 0:
            lst.append(word[0:n])
    lst.append(word)
    for elem in lst:
        b = lst.index(elem)
        prefix_dict[elem] = lst[b:]
    return prefix_dict

Why add the 0th item if you remove it again right away?

Another simplification could be

def add_prefixs(word, prefix_dict):
    # lst = [word[0:n] for n, letter in enumerate(word) if n > 0] + [word]
    # Why do I think in such a complicated way? Better use:
    lst = [word[0:n+1] for n, letter in enumerate(word)]
    prefix_dict.update((elem, lst[b:]) for b, elem in enumerate(lst))
    return prefix_dict

With a class like

class Segments(object):
    def __init__(self, string, minlength=1):
        self.string = string
        self.minlength = minlength
    def __getitem__(self, index):
        s = self.string[:self.minlength + index]
        if len(s) < self.minlength + index: raise IndexError
        if index >= len(self): raise IndexError  # alternatively
        return s
    def cut(self, num):
        return type(self)(self.string, self.minlength + num)
    def __repr__(self):
        return repr(list(self))
    def __len__(self):
        return len(self.string) - self.minlength + 1

this can be simplified further:

def add_prefixes(word, prefix_dict):
    lst = Segments(word)
    prefix_dict.update((prefix, lst.cut(n)) for n, prefix in enumerate(lst))
    return prefix_dict

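Hmm. Thinking about it again, this isn't really a simplification. But it does avoid making many copies of essentially the same data, or of parts of it...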
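A usage sketch of the class-based version may help here. It restates the class in condensed form (keeping only the first of the two alternative IndexError checks) so the block runs on its own; `result` is a name I added. Note that the dictionary values are Segments objects rather than plain lists, so they need `list()` if a real list is wanted:

```python
class Segments(object):
    def __init__(self, string, minlength=1):
        self.string = string
        self.minlength = minlength

    def __getitem__(self, index):
        s = self.string[:self.minlength + index]
        # Running past the end of the string ends iteration.
        if len(s) < self.minlength + index:
            raise IndexError(index)
        return s

    def cut(self, num):
        # Same underlying string, later starting length: no list is copied.
        return type(self)(self.string, self.minlength + num)

    def __repr__(self):
        return repr(list(self))

    def __len__(self):
        return len(self.string) - self.minlength + 1


def add_prefixes(word, prefix_dict):
    lst = Segments(word)
    prefix_dict.update((prefix, lst.cut(n)) for n, prefix in enumerate(lst))
    return prefix_dict


result = add_prefixes("hello", {})
print(result["hel"])        # repr is built from list(self)
print(list(result["hel"]))  # ['hel', 'hell', 'hello']
print(result["hel"].string is result["h"].string)  # True: shared string
```

Every value refers to the same string object and only stores a different minimum length, which is the saving the answer alludes to.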


2012-11-27 16:53:54 glglgl

I think the most Pythonic way to do it is:

def add_prefixs(word, prefix_dict):
    lst = [word[0:n+1] for n in range(len(word))]
    prefix_dict.update((k, lst[n:]) for n, k in enumerate(lst))


2012-11-27 18:00:25 barracel
