2017-07-22 13 views
1

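How can I print all lemma names of a word from NLTK synsets without repeating a synonym and its pos_tag more than once?

I am trying to find the synonyms of a word. Here is my code: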

from nltk.corpus import wordnet as wn 
from nltk import pos_tag 

def getSynonyms(word1):
    synonymList1 = []
    for data1 in word1:
        wordnetSynset1 = wn.synsets(data1)
        tempList1 = []
        for synset1 in wordnetSynset1:
            synLemmas = synset1.lemma_names()
            for i in xrange(len(synLemmas)):
                word = synLemmas[i].replace('_', ' ')
                tempList1.append(pos_tag(word.split()))
        synonymList1.append(tempList1)
    return synonymList1

word1 = ['study'] 

syn1 = getSynonyms(word1) 

print syn1 

And here is the output:

[[[(u'survey', 'NN')], [(u'study', 'NN')], [(u'study', 'NN')], [(u'work', 'NN')], [(u'report', 'NN')], [(u'study', 'NN')], [(u'written', 'VBN'), (u'report', 'NN')], [(u'study', 'NN')], [(u'study', 'NN')], [(u'discipline', 'NN')], [(u'subject', 'NN')], [(u'subject', 'JJ'), (u'area', 'NN')], [(u'subject', 'JJ'), (u'field', 'NN')], [(u'field', 'NN')], [(u'field', 'NN'), (u'of', 'IN'), (u'study', 'NN')], [(u'study', 'NN')], [(u'bailiwick', 'NN')], [(u'sketch', 'NN')], [(u'study', 'NN')], [(u'cogitation', 'NN')], [(u'study', 'NN')], [(u'study', 'NN')], [(u'study', 'NN')], [(u'analyze', 'NN')], [(u'analyse', 'NN')], [(u'study', 'NN')], [(u'examine', 'NN')], [(u'canvass', 'NN')], [(u'canvas', 'NN')], [(u'study', 'NN')], [(u'study', 'NN')], [(u'consider', 'VB')], [(u'learn', 'NN')], [(u'study', 'NN')], [(u'read', 'NN')], [(u'take', 'VB')], [(u'study', 'NN')], [(u'hit', 'VB'), (u'the', 'DT'), (u'books', 'NNS')], [(u'study', 'NN')], [(u'meditate', 'NN')], [(u'contemplate', 'NN')]]] 

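As we can see, (u'study', 'NN') appears more than once.

How can I print each synonym only once, without repetition?

That is, each synonym should appear only once, with a single POS tag.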

Answers

1

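Instead of always appending to the list inside your inner for loop, at the line tempList1.append(pos_tag(word.split())), you should first check whether the element you are about to add is already in the list. A simple if statement does that: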

# Tag the lemma once, then append it only if it is not already in the list.
tagged = pos_tag(word.split())
if tagged not in tempList1:
    tempList1.append(tagged)

This way an element is never added twice.
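The membership check can be tried without NLTK installed. The sketch below is only an illustration: the helper name dedupe_tagged is hypothetical, and the sample data is copied from the output in the question rather than produced by a live pos_tag call. It also keeps a set of already-seen entries, which is a variation on the answer's plain list check and may be faster on long lists.

```python
def dedupe_tagged(tagged_lemmas):
    """Return the tagged lemmas with duplicates dropped, keeping first-seen order."""
    seen = set()
    unique = []
    for tagged in tagged_lemmas:
        # A list is unhashable, so use a tuple of its (word, tag) pairs as the key.
        key = tuple(tagged)
        if key not in seen:
            seen.add(key)
            unique.append(tagged)
    return unique


# Sample entries taken from the output in the question (not a live NLTK call).
sample = [
    [("survey", "NN")],
    [("study", "NN")],
    [("study", "NN")],
    [("written", "VBN"), ("report", "NN")],
    [("study", "NN")],
    [("subject", "JJ"), ("area", "NN")],
]

unique_sample = dedupe_tagged(sample)
for entry in unique_sample:
    print(entry)
```

The order of first appearance is preserved, which a plain set() conversion would not guarantee.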

+0

Sorry, but it still repeats. – sang

+0

Wow! Thank you, sir. anon – sang

+0

Glad I could help :) – anon

-1

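syn1 = set(getSynonyms(word1))

Converting the returned list to a set removes the duplicates. I am assuming here that order does not matter, since sets have no defined order.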

+0

This won't work: getSynonyms() returns a list of lists, and the list type is not hashable. – anon

+0

Yes, this is wrong. I tried it. – sang

+0

'syn1 = set(syn[0] for syn in getSynonyms(word1))' would strip off one list level. But the OP should fix their code at an earlier stage. (Also, 'wordnet' has no 'synsets()' method, so the code in the question won't run.) – alexis
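The hashability objection raised in these comments can be checked without NLTK. This is a minimal sketch, assuming a small hand-written stand-in (the name nested is hypothetical) for the nested list that getSynonyms(['study']) returns. It shows that set() rejects lists, and that converting each tagged lemma to a tuple first makes a set usable, at the cost of losing the original order.

```python
# Hypothetical stand-in for the nested result of getSynonyms(['study']):
# one inner list per input word, each holding lists of (word, tag) pairs.
nested = [[[("survey", "NN")], [("study", "NN")], [("study", "NN")]]]

try:
    set(nested)
    set_failed = False
except TypeError:
    # Lists are mutable and therefore unhashable, so they cannot be set members.
    set_failed = True

# Converting each tagged lemma to a tuple makes it hashable.
unique_per_word = [
    set(tuple(tagged) for tagged in per_word) for per_word in nested
]

print(set_failed)
print(unique_per_word)
```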
