What is the probability of 'beginning' given 'the'?


Using an NLTK Conditional Frequency Distribution and the nltk.bigrams function, train a bigram model on the Genesis corpus:

import nltk

text = nltk.corpus.genesis.words('english-kjv.txt')
bigrams = nltk.bigrams(text)
cfd = nltk.ConditionalFreqDist(bigrams)
Answer the following questions:

What is the Probability of ‘begining’ given ‘the’? 
What is the probability of ‘the’?

Note: the probabilities given as answers must be ones that can be computed from this corpus.

Hi, can anyone help me? This is from the NLTK book. When I work it out I get 78%, which doesn't make sense. I am trying to compute it in Python.

Source

2014-05-07 user3563184

Zero. That's not how "beginning" is spelled :) – hobbs

My genius! So what then? I still get 78 – user3563184

There is a slight difference between the probability of 'beginning' intersect 'the'

p('beginning','the')

and the probability of 'beginning' given 'the':

p('beginning'|'the') = p('beginning','the')/p('the')
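Written out in raw counts, the relation above looks roughly like this. The symbols are introduced here for illustration and are not from the post: c(·) is a raw count, w_1 w_2 is an ordered bigram (w_1 comes first), N_1 is the number of tokens and N_2 = N_1 - 1 is the number of bigrams.

```latex
p(w_2 \mid w_1)
  = \frac{p(w_1, w_2)}{p(w_1)}
  = \frac{c(w_1, w_2) / N_2}{c(w_1) / N_1}
  \approx \frac{c(w_1, w_2)}{c(w_1)}
```

For the question, w_1 = 'the' and w_2 = 'beginning'. Note that bigram counts are ordered, so c('the', 'beginning') and c('beginning', 'the') are generally different numbers.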

Try:

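from collections import Counter

import nltk

text = nltk.corpus.genesis.words('english-kjv.txt')
bigram_counts = Counter(nltk.bigrams(text))  # counts of adjacent word pairs
unigram_counts = Counter(text)               # counts of single words

total_bigrams = sum(bigram_counts.values())
total_unigrams = sum(unigram_counts.values())

# Joint probability of the bigram and marginal probability of 'unto'.
p_said_unto = bigram_counts['said', 'unto'] / total_bigrams
p_unto = unigram_counts['unto'] / total_unigrams

print(f"p('said','unto') = {p_said_unto:.6f}")
# Conditional = joint / marginal, as in the formula above.
print(f"p('said'|'unto') = {p_said_unto / p_unto:.6f}")
# Raw count of the bigram; a count of zero means a probability of zero.
print("p('beginning','the') =", bigram_counts['beginning', 'the'])

[out]:

p('said','unto') = 0.003976
p('said'|'unto') = 0.301702
p('beginning','the') = 0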
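For the bigram-model reading of the question (the probability that 'beginning' is the next word after 'the'), a minimal sketch follows. It uses only the standard library and a made-up toy token list, so `build_bigram_table`, `next_word_prob` and `tokens` are illustrative names and data, not part of NLTK or of the original post.

```python
from collections import Counter, defaultdict


def build_bigram_table(tokens):
    """Map each word to a Counter of the words that immediately follow it."""
    table = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        table[prev][nxt] += 1
    return table


def next_word_prob(table, prev, word):
    """Estimate p(word | prev) as count(prev, word) / count(prev as first word)."""
    followers = table.get(prev)
    if not followers:
        return 0.0  # prev never appears as the first word of a bigram
    return followers[word] / sum(followers.values())


# Toy token list standing in for the corpus (made up for illustration).
tokens = "in the beginning god created the heaven and the earth".split()
table = build_bigram_table(tokens)

p_beginning = next_word_prob(table, "the", "beginning")
p_misspelled = next_word_prob(table, "the", "begining")  # misspelled, unseen word
print(p_beginning, p_misspelled)
```

With the `cfd` built in the question, `cfd['the'].freq('beginning')` should compute the same kind of ratio on the real corpus. A misspelled word such as 'begining' that never follows 'the' gets probability 0, which matches the comment above.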

Source

2014-05-17 21:32:59 alvas
