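I have spent hours trying to calculate entropy, and I know I am missing something. Hopefully someone here can give me an idea!
Edit: I think my formula is wrong!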
CODE:
info <- function(CLASS.FREQ){
  freq.class <- CLASS.FREQ
  total <- 0  # running sum; named `total` so it does not shadow the function name
  for(i in seq_along(freq.class)){  # seq_along() is safe for empty input
    if(freq.class[[i]] != 0){ # zero check in class
      # entropy contribution of class i
      entropy <- -sum(freq.class[[i]] * log2(freq.class[[i]]))
    }else{
      entropy <- 0
    }
    total <- total + entropy # sum up entropy from all classes
  }
  return(total)
}
I hope my post is clear, since this is my first time posting here.
Here is my dataset:
buys <- c("no", "no", "yes", "yes", "yes", "no", "yes", "no", "yes", "yes", "yes", "yes", "yes", "no")
credit <- c("fair", "excellent", "fair", "fair", "fair", "excellent", "excellent", "fair", "fair", "fair", "excellent", "excellent", "fair", "excellent")
student <- c("no", "no", "no","no", "yes", "yes", "yes", "no", "yes", "yes", "yes", "no", "yes", "no")
income <- c("high", "high", "high", "medium", "low", "low", "low", "medium", "low", "medium", "medium", "medium", "high", "medium")
age <- c(25, 27, 35, 41, 48, 42, 36, 29, 26, 45, 23, 33, 37, 44) # we change the age from categorical to numeric
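As a minimal sketch of how the data above could feed an entropy calculation, assuming `buys` is the target class and that the function is meant to receive class proportions rather than raw counts. The helper name `class_entropy` is hypothetical and not part of the original post.

```r
# Target column from the post
buys <- c("no", "no", "yes", "yes", "yes", "no", "yes", "no",
          "yes", "yes", "yes", "yes", "yes", "no")

# Hypothetical helper: Shannon entropy (in bits) of a categorical vector
class_entropy <- function(x) {
  p <- prop.table(table(x))  # class proportions, summing to 1
  p <- p[p > 0]              # guard so 0 * log2(0) never occurs
  -sum(p * log2(p))
}

h <- class_entropy(buys)
print(round(h, 3))  # 5 "no" and 9 "yes" out of 14 observations
```

`prop.table(table(x))` is used here so that the counts are normalised before the logarithm is taken; passing raw counts to the same formula would not give an entropy.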
Ironically, of course, the worse the calculation, the closer it gets to the answer. – Strawberry 2014-12-02 16:58:51
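It would be good to post (a) the formula you believe is correct, and (b) an example of the kind of data you will be passing to this function. Using `dput()` is a good way to share data. – Gregor 2014-12-02 17:01:20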
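For illustration, a small sketch of what sharing data with `dput()` could look like. The three-element vector `x` is a made-up slice used only for this example.

```r
# A short character vector, used only for illustration
x <- c("no", "no", "yes")

# dput() writes an R expression that recreates the object
txt <- capture.output(dput(x))
print(txt)

# Evaluating that text rebuilds an identical object
y <- eval(parse(text = txt))
print(identical(x, y))
```

The printed expression can be pasted straight into a question, so readers can recreate the object without retyping it.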
What answer do you expect? Your code runs without errors and correctly computes the Shannon entropy. – cdeterman 2014-12-02 17:20:34
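A quick check of that last point, as a sketch: the formula gives the Shannon entropy only when the input is a set of proportions summing to 1. `info_bits` is a hypothetical, compact restatement of the posted loop, assuming a plain numeric vector as input.

```r
# Compact restatement of the posted loop (hypothetical name `info_bits`)
info_bits <- function(freq) {
  freq <- freq[freq != 0]  # same zero check as in the post
  -sum(freq * log2(freq))
}

counts <- c(no = 5, yes = 9)    # class counts of `buys` in the post
props  <- counts / sum(counts)  # proportions summing to 1

h_props  <- info_bits(props)    # Shannon entropy in bits
h_counts <- info_bits(counts)   # not an entropy: input does not sum to 1

print(h_props)
print(h_counts)
print(info_bits(c(0.5, 0.5)))   # two equally likely classes
```

If the function is being fed raw counts, dividing by `sum(counts)` first is one way to get a value in the expected range.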