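For some reason, WordNet indexes antonymy relations at the Lemma level rather than the Synset level (see http://wordnetweb.princeton.edu/perl/webwn?o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&s=good&i=8&h=00001000000000000000000000000000#c), so the question is whether Synsets and Lemmas have a many-to-many or a one-to-one relation.
In the case of ambiguous words, where one word has many meanings, we have a one-to-many relation from String to Synset, e.g.: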
>>> wn.synsets('dog')
[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]
In the case of one meaning/concept with multiple representations, we have a one-to-many relation from Synset to String (where String refers to lemma names):
>>> dog = wn.synset('dog.n.1')
>>> dog.definition()
u'a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds'
>>> dog.lemma_names()
[u'dog', u'domestic_dog', u'Canis_familiaris']
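Note: up to this point we have been comparing the relationship between Strings and Synsets, not between Lemmas and Synsets.
The "cute" thing is that a Lemma and a String have a one-to-one relation: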
>>> wn.synsets('dog')
[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]
>>> wn.synsets('dog')[0]
Synset('dog.n.01')
>>> wn.synsets('dog')[0].definition()
u'a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds'
>>> wn.synsets('dog')[0].lemmas()
[Lemma('dog.n.01.dog'), Lemma('dog.n.01.domestic_dog'), Lemma('dog.n.01.Canis_familiaris')]
>>> wn.synsets('dog')[0].lemmas()[0]
Lemma('dog.n.01.dog')
>>> wn.synsets('dog')[0].lemmas()[0].name()
u'dog'
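The _name attribute of a Lemma object holds a unicode string, not a list. From the code: https://github.com/nltk/nltk/blob/develop/nltk/corpus/reader/wordnet.py#L202 and https://github.com/nltk/nltk/blob/develop/nltk/corpus/reader/wordnet.py#L444
It also looks like a Lemma has a one-to-one relation with a Synset. From the docstring at https://github.com/nltk/nltk/blob/develop/nltk/corpus/reader/wordnet.py#L220, the Lemma attributes are accessible "via methods with the same name::".
So we can do this and rely on each Lemma object returning exactly 1 synset: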
>>> wn.synsets('dog')[0].lemmas()[0]
Lemma('dog.n.01.dog')
>>> wn.synsets('dog')[0].lemmas()[0].synset()
Synset('dog.n.01')
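To make the three relations above easier to see side by side, here is a minimal sketch that models them with plain Python objects. It does not use NLTK: ToyLemma, LEMMAS, by_string and by_synset are hypothetical names, and the data is a hand-written, abbreviated echo of the dog transcripts above rather than anything read from WordNet.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyLemma:
    """One (synset, name) pairing: exactly one string and exactly one synset."""
    synset_id: str  # e.g. 'dog.n.01'
    name: str       # e.g. 'domestic_dog'


# Hand-written, abbreviated toy data (an assumption for illustration only).
LEMMAS = [
    ToyLemma('dog.n.01', 'dog'),
    ToyLemma('dog.n.01', 'domestic_dog'),
    ToyLemma('dog.n.01', 'Canis_familiaris'),
    ToyLemma('frump.n.01', 'frump'),
    ToyLemma('frump.n.01', 'dog'),
    ToyLemma('chase.v.01', 'chase'),
    ToyLemma('chase.v.01', 'dog'),
]

by_string = defaultdict(set)   # string -> synset ids (one-to-many)
by_synset = defaultdict(list)  # synset id -> lemma names (one-to-many)
for lemma in LEMMAS:
    by_string[lemma.name].add(lemma.synset_id)
    by_synset[lemma.synset_id].append(lemma.name)

# One string, many synsets.
print(sorted(by_string['dog']))  # ['chase.v.01', 'dog.n.01', 'frump.n.01']
# One synset, many strings.
print(by_synset['dog.n.01'])     # ['dog', 'domestic_dog', 'Canis_familiaris']
```

Each ToyLemma carries a single name and a single synset id, which is the one-to-one property discussed above; the one-to-many views only appear once lemmas are grouped by string or by synset.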
Say you are doing some sentiment analysis and need the antonyms of every adjective in WordNet; you can easily collect the Synsets of the antonyms like this:
>>> from itertools import chain
>>> from nltk.corpus import wordnet as wn
>>> all_adj_in_wn = wn.all_synsets(pos='a')
>>> def get_antonyms(ss):
...     # Antonyms hang off lemmas, so collect the synset of each antonym lemma.
...     return set(chain(*[[a.synset() for a in l.antonyms()] for l in ss.lemmas()]))
...
>>> for ss in all_adj_in_wn:
...     print('%s : %s' % (ss, get_antonyms(ss)))
...
Synset('unable.a.01') : set([Synset('unable.a.01')])
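As a possible variation on the loop above, the sketch below writes the same lemma-to-antonym-to-synset walk as a set comprehension and keeps only the synsets that actually have antonyms. The function names antonym_synsets and antonyms_by_synset are hypothetical; the only calls assumed on the objects are lemmas(), antonyms() and synset(), as used in the transcripts above.

```python
def antonym_synsets(synset):
    """Collect the synsets reached through the antonyms of each lemma."""
    return {
        antonym.synset()
        for lemma in synset.lemmas()
        for antonym in lemma.antonyms()
    }


def antonyms_by_synset(synsets):
    """Map each synset to its antonym synsets, skipping synsets with none."""
    result = {}
    for synset in synsets:
        found = antonym_synsets(synset)
        if found:
            result[synset] = found
    return result
```

With NLTK and the WordNet corpus installed, this would be called as antonyms_by_synset(wn.all_synsets(pos='a')) after from nltk.corpus import wordnet as wn. Skipping empty results keeps the mapping small, since many adjective synsets have no lemma-level antonym.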
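This is tricky because antonymy relations are linked through lemmas, not synset to synset. – alvas
See http://wordnetweb.princeton.edu/perl/webwn?o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&s=good&i=8&h=00001000000000000000000000000000#c, where **S** denotes synsets and **W** denotes words (i.e. lemmas). – alvas
Hi Alvas! I was actually trying to find your email but couldn't find it. How can I contact you? I remember you helping the most with all my WordNet questions here :) – modarwish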