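How to determine semantic hierarchies/relations using NLTK?
I want to use NLTK and WordNet to understand the semantic relationship between two words. For example, if I enter "employee" and "waiter", it should return something showing that employee is more general than waiter. Or for "employee" and "worker", it should return equal. Does anyone know how to do this?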
First, you have to tackle the problem of turning words into lemmas and then into synsets, i.e. how do you identify a synset from a word?
word => lemma => lemma.pos.sense => synset
Waiters => waiter => 'waiter.n.01' => wn.synset('waiter.n.01')
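A minimal sketch of the word-to-synsets step, in case it helps. It relies on `wn.synsets()`, which lemmatizes inflected forms for you and returns every sense it finds, so you still have to pick the right one yourself. The helper name `candidate_synsets` and its `reader` parameter are made up for illustration (the parameter only exists so the function can be tried without the corpus installed).

```python
def candidate_synsets(word, pos="n", reader=None):
    """Return every synset WordNet lists for `word` with the given part of speech.

    `reader` is a hypothetical hook: any object with a synsets(word, pos=...)
    method. When omitted, NLTK's WordNet reader is used, which requires the
    WordNet corpus data to be downloaded.
    """
    if reader is None:
        from nltk.corpus import wordnet as reader
    # synsets() applies WordNet's own morphological analysis,
    # so an inflected form such as "waiters" is looked up as "waiter".
    return reader.synsets(word, pos=pos)
```

With the corpus installed, `candidate_synsets("waiters")` gives the list of noun senses to choose from; choosing the right sense for your context is a separate problem that this sketch does not address.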
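So let's say you have solved the problem above and arrived at the rightmost representation of waiter; then you can go on to compare synsets. Note that a word can have many synsets.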
from nltk.corpus import wordnet as wn

# Look up each synset by its lemma.pos.sense name.
waiter = wn.synset('waiter.n.01')
employee = wn.synset('employee.n.01')

# Lemma names of every transitive hyponym (more specific term) of each synset.
all_hyponyms_of_waiter = list(set([w.replace("_", " ") for s in waiter.closure(lambda s: s.hyponyms()) for w in s.lemma_names()]))
all_hyponyms_of_employee = list(set([w.replace("_", " ") for s in employee.closure(lambda s: s.hyponyms()) for w in s.lemma_names()]))

if 'waiter' in all_hyponyms_of_employee:
    print('employee more general than waiter')
elif 'employee' in all_hyponyms_of_waiter:
    print('waiter more general than employee')
else:
    print("WordNet doesn't place employee and waiter on the same hypernym path")
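The question also asks about the "equal" case, which comparing lemma names does not cover. Here is a rough sketch of one way to compare the synsets themselves by walking up the hypernym links. The function names `ancestors` and `relation` and the returned labels are my own, not part of NLTK; the only thing assumed about the arguments is that they have a `hypernyms()` method, as NLTK synsets do. Instance hypernyms are ignored here.

```python
def ancestors(synset):
    """Collect every synset reachable by repeatedly following hypernyms()."""
    seen = set()
    stack = list(synset.hypernyms())
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(node.hypernyms())
    return seen


def relation(a, b):
    """Classify how synset `a` relates to synset `b` (labels are illustrative)."""
    if a == b:
        return "equal"
    if a in ancestors(b):
        # a sits above b in the hierarchy, so a is the broader term
        return "first is more general"
    if b in ancestors(a):
        return "second is more general"
    return "unrelated"
```

With the corpus installed you would call it as `relation(wn.synset('employee.n.01'), wn.synset('waiter.n.01'))`. To compare words rather than synsets, you would have to run this over every pair of candidate synsets and decide how to combine the answers.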
What have you tried? – Anthon 2013-03-25 21:54:07