Multinomial Naive Bayes classifier in R

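I am re-asking the (identically titled) question Multinomial Naive Bayes Classifier. That question seems to have accepted an answer that I believe is wrong, or at least I would like more explanation, because I still don't understand it.

Every Naive Bayes classifier I have seen in R so far (including bnlearn and klaR) assumes that the features have Gaussian likelihoods.

Is there an implementation of a Naive Bayes classifier in R that uses multinomial likelihoods (similar to scikit-learn's MultinomialNB)?

In particular, if it turns out there is a way to call naive.bayes in either of these packages so that the likelihoods are estimated with a multinomial distribution, I would really appreciate an example of how to do it. I have searched for examples and have not found any. For instance: is this what the usekernel argument of klaR's NaiveBayes is for?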

Source

2014-05-22 gabe

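The first example at the bottom of the 'bnlearn' link (learning.test) has discrete variables. To view the conditional probability tables, use 'bn.fit(bn, learning.test)' – user20650

Thanks, user20650. I see that naive.bayes can handle discrete or continuous data. My question is: how are the likelihoods of the features estimated? The documentation says it assumes they are Gaussian. Is there a way to change this? – gabe

I haven't looked at how the predictions are calculated, but I expect they are computed from the CPTs, which are the MLE of the multinomial distribution. I have added a small example; perhaps it helps – user20650

I don't know which algorithm the predict method for naive.bayes models calls, but you can compute the predictions yourself from the conditional probability tables (the MLE estimates).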

    # You may need to get dependencies of gRain from here
    # source("http://bioconductor.org/biocLite.R")
    # biocLite("RBGL")

    library(bnlearn) 
    library(gRain)

Using the first example from the naive.bayes help page:

    data(learning.test)

    # fit model 
    bn <- naive.bayes(learning.test, "A") 

    # look at cpt's 
    fit <- bn.fit(bn, learning.test)  

    # check that the cpt's (proportions) are the mle of the multinomial dist. 
    # Node A: 
    all.equal(prop.table(table(learning.test$A)), fit$A$prob) 
    # Node B: 
    all.equal(prop.table(table(learning.test$B, learning.test$A),2), fit$B$prob) 


    # look at predictions - include probabilities 
    pred <- predict(bn, learning.test, prob=TRUE) 
    pr <- data.frame(t(attributes(pred)$prob)) 
    pr <- cbind(pred, pr) 

    head(pr, 2) 

    #   preds          a          b          c
    # 1     c 0.29990442 0.33609392 0.36400165
    # 2     a 0.80321241 0.17406706 0.02272053

Calculate the predicted probabilities from the CPTs by running a query, using gRain:

    # query using the junction tree algorithm
    jj <- compile(as.grain(fit))

    # Get predicted probs for the first observation
    net1 <- setEvidence(jj, nodes=c("B", "C", "D", "E", "F"),
                        states=c("c", "b", "a", "b", "b"))

    querygrain(net1, nodes="A", type="marginal")

    # $A
    # A
    #         a         b         c
    # 0.3001765 0.3368022 0.3630213

    # Get predicted probs for the second observation
    net2 <- setEvidence(jj, nodes=c("B", "C", "D", "E", "F"),
                        states=c("a", "c", "a", "b", "b"))

    querygrain(net2, nodes="A", type="marginal")

    # $A
    # A
    #          a          b          c
    # 0.80311043 0.17425364 0.02263593

So these probabilities are quite close to what you get from bnlearn, and they are calculated using the MLEs.
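For readers who want to see the arithmetic without bnlearn or gRain, the same kind of calculation can be sketched in base R. This is only an illustration under stated assumptions: the data frame `toy` and the helper `nb_posterior` are hypothetical names made up for this sketch, not part of either package, and the toy data stand in for learning.test.

```r
# Hypothetical toy data with a class A and two discrete features B and C
toy <- data.frame(
  A = factor(c("a", "a", "a", "b", "b", "b", "b", "a")),
  B = factor(c("x", "x", "y", "y", "y", "x", "y", "x")),
  C = factor(c("u", "v", "u", "v", "v", "v", "u", "u"))
)

# Class prior P(A) and conditional probability tables P(feature | A),
# estimated as relative frequencies (the multinomial MLE, no smoothing)
prior <- prop.table(table(toy$A))
cpt_B <- prop.table(table(toy$B, toy$A), margin = 2)
cpt_C <- prop.table(table(toy$C, toy$A), margin = 2)

# Hypothetical helper: posterior over A for one observation.
# Multiply the prior by the matching CPT rows, then normalise.
nb_posterior <- function(b, c) {
  unnorm <- as.numeric(prior) * cpt_B[b, ] * cpt_C[c, ]
  unnorm / sum(unnorm)
}

post <- nb_posterior("x", "u")
print(post)
```

With this toy data the posterior for the observation (B = "x", C = "u") works out to 0.9 for class a and 0.1 for class b. Because no smoothing is applied, a feature value never seen with a class gives that class a probability of zero.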

Source

2014-05-25 20:46:44 user20650

Thanks. I must have missed this before, but the printed fit describes "Parameters of node A (multinomial distribution)". – gabe
