To find which words share the same or a similar POS tag, you can use the usual idiom:
>>> from nltk.tag import pos_tag
>>> sent = "dog is barking at tree"
>>> [i for i in pos_tag(sent.split()) if i[1] == "NN"]
[('dog', 'NN'), ('tree', 'NN')]
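If you want every tag group at once rather than only `NN`, something along these lines may help. This is a minimal sketch: the helper name `group_by_tag` and its `prefix_len` parameter are made up for illustration, and the `sample` list is typed by hand in the `(word, tag)` shape that `pos_tag` returns, so it is not real tagger output.

```python
from collections import defaultdict


def group_by_tag(tagged, prefix_len=None):
    """Group words by POS tag.

    With prefix_len set, only the first prefix_len characters of each tag
    are used as the key, so prefix_len=2 puts NN, NNS and NNP under 'NN'.
    """
    groups = defaultdict(list)
    for word, tag in tagged:
        key = tag[:prefix_len] if prefix_len else tag
        groups[key].append(word)
    return dict(groups)


# Hand-written (word, tag) pairs for illustration; not produced by a tagger.
sample = [('dog', 'NN'), ('is', 'VBZ'), ('barking', 'VBG'),
          ('at', 'IN'), ('trees', 'NNS')]

print(group_by_tag(sample))                # exact tags
print(group_by_tag(sample, prefix_len=1))  # coarse groups: N, V, I
```

Grouping on a tag prefix is one way to treat "similar" tags (for example `NN` and `NNS`) as the same class.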
Then, to get the possible synsets for a word, simply do:
>>> from nltk.corpus import wordnet as wn
>>> wn.synsets('dog')
[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]
Most likely the solution you are looking for is:
>>> from nltk.corpus import wordnet as wn
>>> from nltk.tag import pos_tag
>>> sent = "dog is barking at tree"
>>> for word, tag in pos_tag(sent.split()):
...     if tag.startswith('N'):  # noun tags: NN, NNS, NNP, NNPS
...         print(wn.synsets(word))
...
[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]
[Synset('tree.n.01'), Synset('tree.n.02'), Synset('tree.n.03'), Synset('corner.v.02'), Synset('tree.v.02'), Synset('tree.v.03'), Synset('tree.v.04')]
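Note that the output above still contains verb synsets such as `chase.v.01`, even though only noun-tagged words were looked up. If that is not what you want, `wn.synsets` accepts a `pos` argument. A rough sketch, assuming a simple prefix-based mapping from Penn Treebank tags to WordNet POS letters; the helper names `penn_to_wordnet` and `senses_by_pos` are hypothetical, not part of NLTK:

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank tag to a WordNet POS letter, or None if no match."""
    # WordNet uses 'n', 'v', 'a', 'r' for noun, verb, adjective, adverb.
    for prefix, pos in (('NN', 'n'), ('VB', 'v'), ('JJ', 'a'), ('RB', 'r')):
        if tag.startswith(prefix):
            return pos
    return None


def senses_by_pos(sentence):
    """Return {word: synsets restricted to the POS the tagger assigned}."""
    # Imported here so penn_to_wordnet can be used without NLTK data installed.
    from nltk.corpus import wordnet as wn
    from nltk.tag import pos_tag

    result = {}
    for word, tag in pos_tag(sentence.split()):
        pos = penn_to_wordnet(tag)
        if pos is not None:
            result[word] = wn.synsets(word, pos=pos)
    return result
```

Calling `senses_by_pos("dog is barking at tree")` needs the WordNet corpus and the tagger model to be downloaded. Taking `len()` of each returned list then gives the number of senses for that part of speech, which is one way to decide whether a word counts as polysemous.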
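Could you post what you have tried? – That1Guy
I am using the relatedness part, which wants input in the form "xyz#n#01" (just an example). I asked the question above because I want to tag particular words as polysemous based on their number of senses. I am trying out a lot of things from the NLTK book. – user3189037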