我正在嘗試實現核密度估計。然而,我的代碼並沒有提供它應該的答案。它也寫在茱莉亞,但代碼應該是自我解釋。核密度估計julia
這裏是算法:
其中
所以算法測試是否x和觀察X_I由一些常數因數(binwidth)加權之間的距離少於一個。如果是這樣,它將0.5 /(n * h)分配給該值,其中n =觀測值的#個。
這是我實現:
#Kernel density function.
#Purpose: estimate the probability density function (pdf)
#of given observations
#@param data: observations for which the pdf should be estimated
#@return: returns an array with the estimated densities
function kernelDensity(data)
|
| #Uniform kernel function.
| #@param x: Current x value
| #@param X_i: x value of observation i
| #@param width: binwidth
| #@return: Returns 1 if the absolute distance from
| #x(current) to x(observation) weighted by the binwidth
| #is less then 1. Else it returns 0.
|
| function uniformKernel(x, observation, width)
| | u = (x - observation)/width
| | abs (u) <= 1 ? 1 : 0
| end
|
| #number of observations in the data set
| n = length(data)
|
| #binwidth (set arbitraily to 0.1
| h = 0.1
|
| #vector that stored the pdf
| res = zeros(Real, n)
|
| #counter variable for the loop
| counter = 0
|
| #lower and upper limit of the x axis
| start = floor(minimum(data))
| stop = ceil (maximum(data))
|
| #main loop
| #@linspace: divides the space from start to stop in n
| #equally spaced intervalls
| for x in linspace(start, stop, n)
| | counter += 1
| | for observation in data
| | |
| | | #count all observations for which the kernel
| | | #returns 1 and mult by 0.5 because the
| | | #kernel computed the absolute difference which can be
| | | #either positive or negative
| | | res[counter] += 0.5 * uniformKernel(x, observation, h)
| | end
| | #devide by n times h
| | res[counter] /= n * h
| end
| #return results
| res
end
#run function
#@rand: generates 10 uniform random numbers between 0 and 1
kernelDensity(rand(10))
,這被返回:
> 0.0
> 1.5
> 2.5
> 1.0
> 1.5
> 1.0
> 0.0
> 0.5
> 0.5
> 0.0
其總和爲:(累計發佈包功能應該是1)8.5
所以有兩個錯誤:
- 這些值未正確縮放。每個數字應該是其當前值的十分之一。事實上,如果觀察由增加數量10^NN = 1,2,...則CDF也由10^N
例如增加:
> kernelDensity(rand(1000))
> 953.53
- 它們不總計爲10(或者如果它不是縮放誤差,則爲1)。隨着樣本量的增加,誤差會變得更加明顯:未包括5%的觀測值。
我相信我實現了公式1:1,因此我真的不明白錯誤在哪裏。
謝謝你,代碼和庫。沒有找到它。 – Vincent