2016-04-14 95 views

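I am trying to use WordNet as a thesaurus: I have a list of words, and I need to collect the synonyms of each one. I tried the following to look up the synonyms of a word in WordNet: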

from nltk.corpus import wordnet as wn 
for i,j in enumerate(wn.synsets('dog')): 
    print (j.lemma_names) 

This code gives the following output:

<bound method Synset.lemma_names of Synset('dog.n.01')> 
<bound method Synset.lemma_names of Synset('frump.n.01')> 
<bound method Synset.lemma_names of Synset('dog.n.03')> 
<bound method Synset.lemma_names of Synset('cad.n.01')> 
<bound method Synset.lemma_names of Synset('frank.n.02')> 
<bound method Synset.lemma_names of Synset('pawl.n.01')> 
<bound method Synset.lemma_names of Synset('andiron.n.01')> 
<bound method Synset.lemma_names of Synset('chase.v.01')> 

But I want to collect only the synonyms in a list, so that the output looks like this:

['frump', 'cad', 'frank', 'pawl', 'andiron', 'chase']


What happens if you change the last line from 'print(j.lemma_names)' to 'print(j.lemma_names())'? – davedwards

Answer


As your output shows, lemma_names is a method, not an attribute, so it has to be called with parentheses. The code below works as you expect:

from nltk.corpus import wordnet as wn 
result = [st.lemma_names()[0] for st in wn.synsets('dog')] 
print(result) 

The output is:

[u'dog', u'frump', u'dog', u'cad', u'frank', u'pawl', u'andiron', u'chase'] 

Note that the items in the list are Unicode strings. That is why you see the leading u in the output.
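The answer above takes only the first lemma name of each synset. Since the question asks for the synonyms of every word in a list, here is a minimal sketch of one way to gather all lemma names instead. The helper names collect_synonyms and synonyms_for_words are hypothetical, not part of NLTK; the only NLTK calls assumed are wn.synsets() and Synset.lemma_names(), both already used above. It also assumes the WordNet corpus has been downloaded.

```python
def collect_synonyms(synsets, exclude=None):
    """Flatten the lemma names of all synsets into one de-duplicated list.

    `synsets` is any iterable of objects that have a lemma_names() method,
    for example the result of wn.synsets('dog'). Names are kept in order of
    first appearance; `exclude` (typically the query word) is dropped.
    """
    seen = set()
    synonyms = []
    for synset in synsets:
        # lemma_names is a method, so it must be called with parentheses
        for name in synset.lemma_names():
            if name != exclude and name not in seen:
                seen.add(name)
                synonyms.append(name)
    return synonyms


def synonyms_for_words(words):
    """Hypothetical wrapper: map each word to its WordNet synonyms."""
    # Imported here so collect_synonyms stays usable without NLTK installed.
    from nltk.corpus import wordnet as wn
    return {word: collect_synonyms(wn.synsets(word), exclude=word)
            for word in words}
```

Calling synonyms_for_words(['dog', 'cat']) would return a dictionary with one list of synonyms per word. Keep in mind that the synsets of a word cover different senses (for 'dog', both the animal and the verb meaning "chase"), so the merged list mixes meanings; filtering by part of speech or by a specific synset may be needed for a stricter thesaurus.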