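NLTK collocations for specific words

I know how to get bigram and trigram collocations using NLTK, and I have applied them to my own corpus. The code is below.
However, I am not sure about two things: (1) how do I get the collocations for a specific word? (2) Does NLTK have a collocation measure based on the log-likelihood ratio?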
import nltk
from nltk.collocations import TrigramCollocationFinder
from nltk.tokenize import word_tokenize

text = "this is a foo bar bar black sheep foo bar bar black sheep foo bar bar black sheep shep bar bar black sentence"
trigram_measures = nltk.collocations.TrigramAssocMeasures()
finder = TrigramCollocationFinder.from_words(word_tokenize(text))
# Score every trigram by pointwise mutual information (PMI), highest score first
for i in finder.score_ngrams(trigram_measures.pmi):
    print(i)
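
For reference, here is a minimal sketch of what I am hoping to end up with, under two assumptions: that filtering the finder with `apply_ngram_filter` is an acceptable way to restrict results to one word, and that `likelihood_ratio` on the association-measures class is the log-likelihood ratio measure I am asking about. The `target` variable is a hypothetical name of mine, and I use `text.split()` instead of `word_tokenize` only so the snippet runs without downloading tokenizer data.

```python
from nltk.collocations import TrigramAssocMeasures, TrigramCollocationFinder

text = ("this is a foo bar bar black sheep foo bar bar black sheep "
        "foo bar bar black sheep shep bar bar black sentence")
target = "foo"  # hypothetical word of interest (my own name, not an NLTK one)

trigram_measures = TrigramAssocMeasures()
finder = TrigramCollocationFinder.from_words(text.split())

# apply_ngram_filter removes every n-gram for which the function returns True,
# so this keeps only the trigrams that contain the target word.
finder.apply_ngram_filter(lambda *words: target not in words)

# Assumption: likelihood_ratio is the log-likelihood ratio based measure.
scored = finder.score_ngrams(trigram_measures.likelihood_ratio)
for ngram, score in scored:
    print(ngram, score)
```

The same pattern should carry over to bigrams by swapping in `BigramCollocationFinder` and `BigramAssocMeasures`, though I have not verified that this is the recommended approach.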