Get the best synonym for a word using WordNet

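I have written code that gets synonyms from WordNet, and it returns the complete list of synonyms for each word. Now I want my code to choose the appropriate synonym from that list based on the sentence.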

For example, given the sentence "I am his older brother", I have to find the best synonym for each word based on this sentence.

Let's take "older". WordNet gives this list of synonyms for "older":

['older', 'one-time', 'former', 'sr.', 'onetime', 'erstwhile', 'honest-to-god', 'old', 'old', 'past', 'sure-enough', 'elder', 'senior', 'old', 'sometime', 'honest-to-goodness', 'past', ...]

Based on this sentence, the best synonym in the list is 'elder', so that is the one that should be selected.

How can I do this?

Code to get the synonyms:

# Requires the NLTK data packages for the tokenizer, POS tagger and WordNet
# (installable via nltk.download()).
from nltk.tokenize import word_tokenize
from nltk.tag import pos_tag
from nltk.corpus import wordnet as wn


def tag(sentence):
    """Tokenize the sentence and return (word, POS tag) pairs."""
    words = word_tokenize(sentence)
    return pos_tag(words)


def paraphraseable(tag):
    """Only nouns, base-form verbs and adjectives are candidates."""
    return tag.startswith('NN') or tag == 'VB' or tag.startswith('JJ')


def pos(tag):
    """Map a Penn Treebank tag to a WordNet POS constant."""
    if tag.startswith('NN'):
        return wn.NOUN
    elif tag.startswith('V'):
        return wn.VERB
    # Other tags (e.g. adjectives) yield None, so wn.synsets() searches every POS.
    return None


def synonyms(word, tag):
    """Collect the lemma names of every synset of the word."""
    lemma_lists = [ss.lemmas() for ss in wn.synsets(word, pos(tag))]
    lemmas = [lemma.name() for lemma in sum(lemma_lists, [])]
    return set(lemmas)


def synonymIfExists(sentence):
    for (word, t) in tag(sentence):
        if paraphraseable(t):
            syns = synonyms(word, t)
            if len(syns) > 1:
                yield [word, list(syns)]
                continue
        yield [word, []]


def paraphrase(sentence):
    return list(synonymIfExists(sentence))


get = paraphrase("I am his older brother")
print("paraphrase", get)

Source

2017-05-25 anashamidkh

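Why is "elder" the best? (That is, what is the criterion for "best", or what algorithm would you use to decide this?) (By the way, I think "big brother" is the best synonym for "older brother", but you don't even have that in your list!)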

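WordNet synsets list words regardless of how frequently they occur in natural language or in a specific context. To cover those two missing aspects, I would rather use a bidirectional predictive model and check which words in the synset occur next to the left context of the word to be replaced. Similarly, you can explore the right context and/or longer contexts.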
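A minimal sketch of the context check described above, assuming plain bigram counts stand in for the predictive model. The toy corpus and the names `bigram_counts` and `best_by_context` are hypothetical, made up for illustration. In practice the candidates would come from the `synonyms()` function in the question and the counts from a much larger corpus.

```python
from collections import Counter


def bigram_counts(sentences):
    """Count adjacent word pairs in an iterable of tokenized sentences."""
    counts = Counter()
    for tokens in sentences:
        lowered = [t.lower() for t in tokens]
        counts.update(zip(lowered, lowered[1:]))
    return counts


def best_by_context(candidates, left, right, counts):
    """Pick the candidate seen most often between `left` and `right`.

    The score is count(left, candidate) + count(candidate, right).
    Returns None when no candidate was ever seen in that context.
    """
    def score(candidate):
        c = candidate.lower()
        return counts[(left.lower(), c)] + counts[(c, right.lower())]

    best = max(candidates, key=score, default=None)
    if best is None or score(best) == 0:
        return None
    return best


# Toy corpus, for illustration only (an assumption, not real data).
toy_corpus = [
    "she is his elder sister".split(),
    "my elder brother lives abroad".split(),
    "his elder brother is a doctor".split(),
    "a former senior manager resigned".split(),
    "the erstwhile king returned".split(),
]
counts = bigram_counts(toy_corpus)
candidates = ["former", "erstwhile", "elder", "senior", "sometime"]
print(best_by_context(candidates, "his", "brother", counts))
```

With this toy corpus the call selects 'elder', because it is the only candidate that appears after "his" or before "brother". Summing the left and right counts is just one simple way to combine the two contexts; a real language model would give smoother scores.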

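Another, simpler approach is to impose a frequency ordering on the WordNet synonyms based on word frequencies in a sufficiently large corpus. This assumes that how often a word occurs in the corpus is a reasonable hint of how suitable it is as a synonym.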
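The frequency ordering could be sketched roughly as follows. The function name `rank_by_frequency` and the tiny token list are hypothetical; a real run would count words over a large corpus and pass in the synonym list produced by the code in the question.

```python
from collections import Counter


def rank_by_frequency(candidates, corpus_tokens):
    """Sort candidates from most to least frequent in the corpus."""
    freq = Counter(token.lower() for token in corpus_tokens)
    # sorted() is stable, so candidates with equal counts keep their input order.
    return sorted(candidates, key=lambda w: freq[w.lower()], reverse=True)


# Toy token list, for illustration only (an assumption, not real data).
toy_tokens = (
    "the former president met the former minister and an elder statesman"
).split()
ranked = rank_by_frequency(["erstwhile", "elder", "former", "sometime"], toy_tokens)
print(ranked)
```

Note that this ranking ignores the sentence entirely, so it always prefers the same synonym for a given word. That makes it simpler than the context-based approach but also less sensitive to meaning.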

Source

2017-05-25 11:00:11 sophros
