2013-12-15 64 views
0

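python itertools permutations: narrowing results by index comparison, not working as expected

A bit of a Python newb here, trying to figure out why my code isn't giving the expected results. First, the code: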

from itertools import permutations

word_list = ['eggs', ',', 'bacon', ',', 'chicken', ',', 'cheese', 'and', 'tomatoes']
grammar_list = ['NOUN', ',', 'NOUN', ',', 'NOUN', ',', 'NOUN', 'AND', 'NOUN']

def permute_nouns():
    permuted_list = []
    comma_AND_indices = [index for index, p in enumerate(grammar_list) if p == "," or p == "AND"]
    # so 'comma_AND_indices' = [1, 3, 5, 7]

    for perm in permutations(word_list):
        observed_comma_AND_indices = [index for index, p in enumerate(perm) if p == "," or p == "and"]
        if comma_AND_indices == observed_comma_AND_indices:
            # what goes wrong here? not matches from list compare above still get appended below.
            permuted_list.append(perm)

    print(permuted_list)

permute_nouns()

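In this function I use the itertools permutations method to create permutations of word_list. However, I don't want all of the permutations. I only want the permutations in which the commas and the word 'and' keep their original positions/indices in word_list, and I want to append those to permuted_list.

I use the line if comma_AND_indices == observed_comma_AND_indices: to filter out the permutations I don't want, but it doesn't work and I don't understand why. When I print permuted_list, I find that the positions of the commas and 'and' are not preserved; instead, all permutations seem to get appended.

(You may wonder why I bother with grammar_list in the function, but the code here is part of a slightly larger script in which grammar_list plays its part.)

Any help shedding light on this is appreciated.

Darren

Edit: here is a sample of what gets printed: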

[('eggs', ',', 'bacon', ',', 'chicken', ',', 'cheese', 'and', 'tomatoes'), ('eggs', ',', 'bacon', ',', 'chicken', ',', 'tomatoes', 'and', 'cheese'), ('eggs', ',', 'bacon', ',', 'chicken', 'and', 'cheese', ',', 'tomatoes'), ('eggs', ',', 'bacon', ',', 'chicken', 'and', 'tomatoes', ',', 'cheese'), ('eggs', ',', 'bacon', ',', 'cheese', ',', 'chicken', 'and', 'tomatoes'), ('eggs', ',', 'bacon', ',', 'cheese', ',', 'tomatoes', 'and', 'chicken'), ('eggs', ',', 'bacon', ',', 'cheese', 'and', 'chicken', ',', 'tomatoes'), ('eggs', ',', 'bacon', ',', 'cheese', 'and', 'tomatoes', ',', 'chicken'), ('eggs', ',', 'bacon', ',', 'tomatoes', ',', 'chicken', 'and', 'cheese'), ('eggs', ',', 'bacon', ',', 'tomatoes', ',', 'cheese', 'and', 'chicken'), ('eggs', ',', 'bacon', ',', 'tomatoes', 'and', 'chicken', ',', 'cheese'), ('eggs', ',', 'bacon', ',', 'tomatoes', 'and', 'cheese', ',', 'chicken'), ('eggs', ',', 'bacon', ',', 'chicken', ',', 'cheese', 'and', 'tomatoes'), ('eggs', ',', 'bacon', ',', 'chicken', ',', 'tomatoes', 'and', 'cheese'), ('eggs', ',', 'bacon', ',', 'chicken', 'and', 'cheese', ',', 'tomatoes'), ('eggs', ',', 'bacon', ',', 'chicken', 'and', 'tomatoes', ',', 'cheese'), ('eggs', ',', 'bacon', ',', 'cheese', ',', 'chicken', 'and', 'tomatoes'), ('eggs', ',', 'bacon', ',', 'cheese', ',', 'tomatoes', 'and', 'chicken'), ('eggs', ',', 'bacon', ',', 'cheese', 'and', 'chicken', ',', 'tomatoes'), ('eggs', ',', 'bacon', ',', 'cheese', 'and', 'tomatoes', ',', 'chicken'), ('eggs', ',', 'bacon', ',', 'tomatoes', ',', 'chicken', 'and', 'cheese'), ('eggs', ',', 'bacon', ',', 'tomatoes', ',', 'cheese', 'and', 'chicken'), ('eggs', ',', 'bacon', ',', 'tomatoes', 'and', 'chicken', ',', 'cheese'), ('eggs', ',', 'bacon', ',', 'tomatoes', 'and', 'cheese', ',', 'chicken'), ('eggs', ',', 'bacon', 'and', 'chicken', ',', 'cheese', ',', 'tomatoes'), ('eggs', ',', 'bacon', 'and', 'chicken', ',', 'tomatoes', ',', 'cheese'), ('eggs', ',', 'bacon', 'and', 'chicken', ',', 'cheese', ',', 'tomatoes'), 
('eggs', ',', 'bacon', 'and', 'chicken', ',', 'tomatoes', ',', 'cheese'), ('eggs', ',', 'bacon', 'and', 'cheese', ',', 'chicken', ',', 'tomatoes'), ('eggs', ',', 'bacon', 'and', 'cheese', ',', 'tomatoes', ',', 'chicken'), ('eggs', ',', 'bacon', 'and', 'cheese', ',', 'chicken', ',', 'tomatoes'), ('eggs', ',', 'bacon', 'and', 'cheese', ',', 'tomatoes', ',', 'chicken'), ('eggs', ',', 'bacon', 'and', 'tomatoes', ',', 'chicken', ',', 'cheese'), ('eggs', ',', 'bacon', 'and', 'tomatoes', ',', 'cheese', ',', 'chicken'), ('eggs', ',', 'bacon', 'and', 'tomatoes', ',', 'chicken', ',', 'cheese'), ('eggs', ',', 'bacon', 'and', 'tomatoes', ',', 'cheese', ',', 'chicken'), ('eggs', ',', 'chicken', ',', 'bacon', ',', 'cheese', 'and', 'tomatoes'), ('eggs', ',', 'chicken', ',', 'bacon', ',', 'tomatoes', 'and', 'cheese'), ('eggs', ',', 'chicken', ',', 'bacon', 'and', 'cheese', ',', 'tomatoes'), ('eggs', ',', 'chicken', ',', 'bacon', 'and', 'tomatoes', ',', 'cheese'), ('eggs', ',', 'chicken', ',', 'cheese', ',', 'bacon', 'and', 'tomatoes'), ('eggs', ',', 'chicken', ',', 'cheese', ',', 'tomatoes', 'and', 'bacon'), ('eggs', ',', 'chicken', ',', 'cheese', 'and', 'bacon', ',', 'tomatoes'), ('eggs', ',', 'chicken', ',', 'cheese', 'and', 'tomatoes', ',', 'bacon'), ('eggs', ',', 'chicken', ',', 'tomatoes', ',', 'bacon', 'and', 'cheese'), ('eggs', ',', 'chicken', ',', 'tomatoes', ',', 'cheese', 'and', 'bacon'), ('eggs', ',', 'chicken', ',', 'tomatoes', 'and', 'bacon', ',', 'cheese'), ('eggs', ',', 'chicken', ',', 'tomatoes', 'and', 'cheese', ',', 'bacon'), ('eggs', ',', 'chicken', ',', 'bacon', ',', 'cheese', 'and', 'tomatoes'), ('eggs', ',', 'chicken', ',', 'bacon', ',', 'tomatoes', 'and', 'cheese'), ('eggs', ',', 'chicken', ',', 'bacon', 'and', 'cheese', ',', 'tomatoes'), ('eggs', ',', 'chicken', ',', 'bacon', 'and', 'tomatoes', ',', 'cheese'), ('eggs', ',', 'chicken', ',', 'cheese', ',', 'bacon', 'and', 'tomatoes'), ('eggs', ',', 'chicken', ',', 'cheese', ',', 'tomatoes', 'and', 'bacon'), 
('eggs', ',', 'chicken', ',', 'cheese', 'and', 'bacon', ',', 'tomatoes'), 
+1

Why not strip out the commas and 'AND's? It would be more efficient. – thefourtheye

+0

I can't reproduce your problem; I see 2880 versions that match your criteria, out of a total of 362880 permutations. –

+0

Would you mind posting part of the expected output... –

Answers

1

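Your code works just fine, although you could produce the same list faster and more concisely with product(): combining the permutations of [','] * 3 + ['and'] with the permutations of [w for w in word_list if w not in (',', 'and')] produces the same 120 * 24 = 2880 combinations.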
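The product() idea can be sketched roughly as follows. This is only an illustration under my own assumptions: the helper name interleave and the variable names separators, punctuation and combined are made up here and are not part of the question's script.

```python
from itertools import permutations, product

word_list = ['eggs', ',', 'bacon', ',', 'chicken', ',', 'cheese', 'and', 'tomatoes']
separators = (',', 'and')

nouns = [w for w in word_list if w not in separators]
punctuation = [w for w in word_list if w in separators]  # [',', ',', ',', 'and']

def interleave(noun_perm, punct_perm):
    """Alternate nouns and separators: noun, sep, noun, sep, ..., noun."""
    sentence = []
    for i, noun in enumerate(noun_perm):
        sentence.append(noun)
        if i < len(punct_perm):
            sentence.append(punct_perm[i])
    return tuple(sentence)

# Pair every noun ordering (120) with every separator ordering (24)
combined = [
    interleave(noun_perm, punct_perm)
    for noun_perm, punct_perm in product(permutations(nouns), permutations(punctuation))
]

print(len(combined))  # 2880
```

This only walks 120 * 24 candidates instead of filtering all 362880 permutations of word_list, which is where the speed-up would come from.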

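If you expected only 120 results, then you forgot that your test does not check the order of the 3 commas and the word 'and' in the output; there are 24 different permutations of that list, and all of them are allowed: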

>>> len(list(permutations([','] * 3 + ['and']))) 
24 

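In other words, for any given permutation of the nouns you are producing 24 variations of the sentence, with the 3 commas and the word and in different positions.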
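To see the difference in numbers, here is a small check, written as a sketch rather than a fix for the original script. The names fixed_indices, index_only and exact are assumptions of mine. It counts how many permutations pass the index-only test, and how many distinct sentences remain when each separator must also be the same token as in word_list.

```python
from itertools import permutations

word_list = ['eggs', ',', 'bacon', ',', 'chicken', ',', 'cheese', 'and', 'tomatoes']
separators = (',', 'and')
fixed_indices = [i for i, w in enumerate(word_list) if w in separators]  # [1, 3, 5, 7]

index_only = 0  # separators sit in the right slots, in any order
exact = set()   # each slot also holds the same token as in word_list

for perm in permutations(word_list):
    observed = [i for i, w in enumerate(perm) if w in separators]
    if observed == fixed_indices:
        index_only += 1
        if all(perm[i] == word_list[i] for i in fixed_indices):
            exact.add(perm)

print(index_only)  # 2880
print(len(exact))  # 120
```

A set is used for the stricter test because permutations() treats the three commas as distinct items, so every sentence would otherwise show up several times.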

To produce just the 120 combinations of the nouns:

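from itertools import permutations, zip_longest  # izip_longest on Python 2

nouns = [w for w in word_list if w not in (',', 'and')]
grammar = [w for w in word_list if w in (',', 'and')]
result = []
for perm in permutations(nouns):
    # Interleave nouns and separators; zip_longest pads the shorter list with None
    result.append([w for word, g in zip_longest(perm, grammar) for w in (word, g) if w is not None])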
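A variation on the same loop, offered only as a sketch: treat word_list itself as a template, keep every separator where it is, and fill the remaining slots with the next noun of the current permutation. The names separators and remaining are made up for this example. It uses nothing version-specific, so it should behave the same on Python 2 and Python 3.

```python
from itertools import permutations

word_list = ['eggs', ',', 'bacon', ',', 'chicken', ',', 'cheese', 'and', 'tomatoes']
separators = (',', 'and')
nouns = [w for w in word_list if w not in separators]

result = []
for perm in permutations(nouns):
    remaining = iter(perm)
    # Separators stay in place; every other slot takes the next noun
    result.append([w if w in separators else next(remaining) for w in word_list])

print(len(result))  # 120
print(result[0])    # ['eggs', ',', 'bacon', ',', 'chicken', ',', 'cheese', 'and', 'tomatoes']
```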
+0

Thanks Martijn, that's great. –

1

If repeats didn't matter, you could just use itertools.product:

import itertools

for words in itertools.product(*(['a'], ['big', 'fat'], ['dog', 'house'])):
    print(' '.join(words))

It prints:

a big dog 
a big house 
a fat dog 
a fat house 

But since they do matter, you have to do something a bit more complicated:

import itertools
import collections

grammar = ['NOUN', ',', 'NOUN', ',', 'NOUN', ',', 'NOUN', 'AND', 'NOUN']
parts_of_speech = {
    'NOUN': ['eggs', 'bacon', 'chicken', 'cheese', 'tomatoes'],
    'AND': ['and'],
    ',': [',']
}

def partial_sentences(words, indices, sentence_length):
    # Fewer words than slots: words have to repeat, so use product();
    # otherwise each word is used at most once, so use permutations().
    if len(indices) > len(words):
        orderings = itertools.product(words, repeat=len(indices))
    else:
        orderings = itertools.permutations(words, len(indices))

    for ordering in orderings:
        sentence = [None] * sentence_length

        for index, word in zip(indices, ordering):
            sentence[index] = word

        yield sentence

def pos_stacks(parts_of_speech, grammar):
    # Collect the indices at which each part of speech occurs
    positions = collections.defaultdict(list)

    for index, pos in enumerate(grammar):
        positions[pos].append(index)

    for pos, indices in positions.items():
        yield partial_sentences(parts_of_speech[pos], indices, len(grammar))

for result in itertools.product(*pos_stacks(parts_of_speech, grammar)):
    # Stack the partial sentences: take the first filled slot at each position
    sentence = [next(word for word in words if word) for words in zip(*result)]

    print(sentence)

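It basically creates all possible orderings of the words in their correct positions, loops over all the parts of speech, and "stacks" the partial sentences together.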
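The "stacking" step on its own can be illustrated with one hand-written combination. This is a minimal sketch: the function name stack and the three example lists are assumptions for illustration only. Each partial sentence fills just the slots of one part of speech and leaves None everywhere else.

```python
# One partial sentence per part of speech; None marks an unfilled slot
noun_part = ['eggs', None, 'bacon', None, 'chicken', None, 'cheese', None, 'tomatoes']
comma_part = [None, ',', None, ',', None, ',', None, None, None]
and_part = [None, None, None, None, None, None, None, 'and', None]

def stack(*partials):
    """Merge partial sentences by taking the first filled slot at each position."""
    return [next(word for word in slot if word is not None) for slot in zip(*partials)]

print(stack(noun_part, comma_part, and_part))
# ['eggs', ',', 'bacon', ',', 'chicken', ',', 'cheese', 'and', 'tomatoes']
```

Because the parts of speech never share an index, exactly one partial sentence has a word at each position, so taking the first non-empty entry is enough.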