2015-05-12 42 views
3

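I wrote some code in Python 3.4.3 for a language-processing program. This part of the code has to remove a word if it is a synonym of the previous word. First, we have to build a list of synonyms for each word. Then we convert all the lists into sets. In the end, we have to compare the sets to check whether they share any synonyms, and I don't know how to compare them. If a word is followed by one of its synonyms, we need to keep only one of the two. How do I remove the synonyms?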

from nltk.corpus import wordnet

text = ['', '', '']
text4 = []

def f4(text):
    global text4

    # Build one list of synonyms (WordNet lemma names) per word
    synonyms = []
    for sentence in text:
        d = ' '
        sentence = sentence.split(d)
        for word in sentence:
            word_synonyms = []
            for synset in wordnet.synsets(word):
                for lemma in synset.lemmas():
                    word_synonyms.append(lemma.name())
            synonyms.append(word_synonyms)

    # Convert each synonym list into a set
    synonyms2 = []
    for x in synonyms:
        x = set(x)
        synonyms2.append(x)
+0

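This question could be more useful to others if you could identify the general case involved. Try to pin down the single (syntax-related) problem you are facing and edit your question accordingly! –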

+0

I would suggest starting with pseudocode. – sevenforce

+0

Which part of the code is the problem? –

Answers

1

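"This part of the code has to remove the next word if it is a synonym of the previous word."

I would suggest a different algorithm. Here is an example: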

text = 'run race stroll rush nice lovely mean kind'  # example text
synonyms = []  # contains a list of synonym lists
synonyms.append(['run', 'race', 'rush'])  # run synonyms
synonyms.append(['nice', 'lovely', 'kind'])  # nice synonyms

def in_synonyms(list_of_synonym_lists, word):
    """Return the index of the synonym list the word is in; -1 if it isn't found."""
    for index, synonym_list in enumerate(list_of_synonym_lists):
        if word in synonym_list:
            return index
    return -1

# The algorithm
split_text = text.split()
index = 1
while index < len(split_text):
    current = in_synonyms(synonyms, split_text[index])
    previous = in_synonyms(synonyms, split_text[index - 1])
    if current != -1 and current == previous:
        # the previous word is in the same synonym list as the current one,
        # so delete the current word and start over
        del split_text[index]
        index = 1  # restart the algorithm
    else:
        # also covers words that are in no synonym list at all
        index += 1  # move on to the next word
text = ' '.join(split_text)

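This code:

  • creates a list of synonym lists
  • iterates over the words of the text
    • if the previous word is in the same synonym list as the current one, we delete the current word and restart the algorithm
    • otherwise we move on to the next word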

I haven't tested it, but I hope you get the idea.
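
As a variation on the approach above (a sketch, not the code from this answer): the restart is not strictly needed, because after a deletion only the next word has to be compared with the last word that was kept. The helper names `build_group_index` and `drop_adjacent_synonyms` are made up for this illustration, and the synonym lists are the same hand-written examples.

```python
def build_group_index(list_of_synonym_lists):
    """Map each word to the index of the synonym list it belongs to."""
    group_of = {}
    for index, synonym_list in enumerate(list_of_synonym_lists):
        for word in synonym_list:
            # if a word appears in several lists, the first list wins
            group_of.setdefault(word, index)
    return group_of


def drop_adjacent_synonyms(text, list_of_synonym_lists):
    """Drop every word that shares a synonym list with the last kept word."""
    group_of = build_group_index(list_of_synonym_lists)
    kept = []
    for word in text.split():
        group = group_of.get(word)
        # words that are in no synonym list (group is None) are always kept
        if kept and group is not None and group == group_of.get(kept[-1]):
            continue
        kept.append(word)
    return ' '.join(kept)


synonyms = [['run', 'race', 'rush'], ['nice', 'lovely', 'kind']]
result = drop_adjacent_synonyms('run race stroll rush nice lovely mean kind', synonyms)
print(result)
```

The dictionary lookup avoids scanning every synonym list for every word, which may matter once the lists come from a real thesaurus rather than two short examples.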

1

If you want to filter out words that are duplicates, tautologies, or synonyms of the preceding word:

filtered = []
previous_word = None
for word in sentence.split(' '):
    # synonymous() is a predicate you have to supply yourself
    if previous_word and synonymous(word, previous_word):
        continue
    filtered.append(word)
    previous_word = word

' '.join(filtered)

You can do the same in a comprehension:

words = sentence.split(' ')
# the first word is paired with None, so it is always kept
new_sentence = ' '.join(word for word, previous in zip(words, [None] + words)
                        if previous is None or not synonymous(word, previous))
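
Both snippets above rely on a `synonymous` predicate that is not shown. One possible sketch, assuming each word's synonyms are available as a set, treats two words as synonymous when their sets share at least one member. The `SYNONYM_SETS` table and the `synonym_set` and `filter_synonyms` helpers are hypothetical names used only for this illustration; the table is a hand-written stand-in for sets built from WordNet lemma names as in the question.

```python
# Hand-written stand-in data (an assumption for illustration); in the question
# these sets would be built from WordNet lemma names.
SYNONYM_SETS = {
    'run': {'run', 'race', 'rush'},
    'race': {'race', 'run', 'rush'},
    'nice': {'nice', 'lovely'},
    'lovely': {'lovely', 'nice'},
}


def synonym_set(word):
    """Return the synonym set for a word; unknown words get a set of just themselves."""
    return SYNONYM_SETS.get(word, {word})


def synonymous(word, other):
    """True if the two words' synonym sets have at least one member in common."""
    return not synonym_set(word).isdisjoint(synonym_set(other))


def filter_synonyms(sentence):
    """Keep a word only if it is not synonymous with the last kept word."""
    filtered = []
    for word in sentence.split():
        if filtered and synonymous(word, filtered[-1]):
            continue
        filtered.append(word)
    return ' '.join(filtered)


print(filter_synonyms('run race to the nice lovely lovely park'))
```

Testing for a shared member is deliberately loose: with real WordNet data, two words can share a lemma in only one of their senses, so a stricter rule may be needed depending on the text.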