2013-03-31

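How do I compute the maximum similarity between synsets in NLTK? (Python)

I need to compute the synset similarity between the items of list1 and list2. For each word in list1, I want to keep only the maximum synset similarity value. How can I do that? I would like my output to be: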

apple.n.01, pear.n.01: 0.909090909091 
honey.n.01, pear.n.01: 0.333333333333 

My code:

>>> from nltk.corpus import wordnet 
>>> import itertools as IT 
>>> list1 = ["apple", "honey"] 
>>> list2 = ["pear", "shell", "movie", "fire", "tree", "candle"] 
>>> for word1, word2 in IT.product(list1, list2):
...     # Compare only the first synset of each word.
...     wordFromList1 = wordnet.synsets(word1)[0]
...     wordFromList2 = wordnet.synsets(word2)[0]
...     s = wordFromList1.wup_similarity(wordFromList2)
...     # name() is a method in NLTK 3; older releases exposed it as an attribute.
...     print('{w1}, {w2}: {s}'.format(w1=wordFromList1.name(), w2=wordFromList2.name(), s=s))


apple.n.01, pear.n.01: 0.909090909091 
apple.n.01, shell.n.01: 0.4 
apple.n.01, movie.n.01: 0.421052631579 
apple.n.01, fire.n.01: 0.142857142857 
apple.n.01, tree.n.01: 0.380952380952 
apple.n.01, candle.n.01: 0.380952380952 
honey.n.01, pear.n.01: 0.333333333333 
honey.n.01, shell.n.01: 0.210526315789 
honey.n.01, movie.n.01: 0.222222222222 
honey.n.01, fire.n.01: 0.125 
honey.n.01, tree.n.01: 0.2 
honey.n.01, candle.n.01: 0.2 

Tagged 'homework'. – alvas

Answer

Try this:

from nltk.corpus import wordnet

list1 = ["apple", "honey"]
list2 = ["pear", "shell", "movie", "fire", "tree", "candle"]

def f(word1, word2):
    # Compare only the first synset of each word.
    wordFromList1 = wordnet.synsets(word1)[0]
    wordFromList2 = wordnet.synsets(word2)[0]
    s = wordFromList1.wup_similarity(wordFromList2)
    # name() is a method in NLTK 3; older releases exposed it as an attribute.
    return wordFromList1.name(), wordFromList2.name(), s

for word1 in list1:
    # Lazily score word1 against every word in list2.
    similarities = (f(word1, word2) for word2 in list2)
    # Keep the tuple whose third element (the similarity) is largest.
    print(max(similarities, key=lambda x: x[2]))

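For each word in list1, this creates a generator that yields tuples of the two synset names and their similarity. It then prints the tuple with the largest value in its third element.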
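The max-with-key step can be tried without WordNet installed. The following is only a minimal sketch: `best_match`, `toy_similarity` and `TOY_SCORES` are hypothetical names introduced for illustration, and the scores are copied from the output in the question rather than computed. It also assumes that a similarity function may return None for some pairs, and skips those so that `max` never has to compare None with a float.

```python
def best_match(word, candidates, similarity):
    """Return (word, candidate, score) for the highest-scoring candidate.

    Pairs whose score is None are skipped. Returns None if nothing is left.
    """
    # Lazily score the word against every candidate.
    scored = ((word, other, similarity(word, other)) for other in candidates)
    # Drop pairs without a usable score.
    comparable = (triple for triple in scored if triple[2] is not None)
    # Pick the triple whose third element (the score) is largest.
    return max(comparable, key=lambda triple: triple[2], default=None)


# Toy scores copied from the question's output, standing in for WordNet.
TOY_SCORES = {
    ("apple", "pear"): 0.909090909091,
    ("apple", "shell"): 0.4,
    ("apple", "movie"): 0.421052631579,
    ("honey", "pear"): 0.333333333333,
    ("honey", "shell"): 0.210526315789,
    ("honey", "movie"): 0.222222222222,
}


def toy_similarity(word1, word2):
    # Hypothetical stand-in: returns None for pairs missing from the table.
    return TOY_SCORES.get((word1, word2))


for word in ["apple", "honey"]:
    print(best_match(word, ["pear", "shell", "movie", "fire"], toy_similarity))
```

To use real WordNet scores, one could pass a function that looks up the first synset of each word with `wordnet.synsets(...)[0]` and calls `wup_similarity`, as in the answer above, in place of `toy_similarity`.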
