I am trying to implement a kernel density estimate. However, my code does not give the correct answer. It is written in Julia, but the code should be self-explanatory.
Here is the algorithm:

    f(x) = 1 / (n * h) * sum_{i=1}^{n} K((x - X_i) / h)

where

    K(u) = 0.5 if |u| <= 1, and 0 otherwise

Thus, the algorithm checks whether the distance between x and the observation X_i, scaled by a constant factor h (the bin width), is at most 1. If so, that observation contributes 0.5 / (n * h) to the estimate at x, where n is the number of observations.
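As a small worked example of that rule, here is a sketch that evaluates the estimate at a single point. The helper name `uniform_kde_at` and the three data values are made up for illustration and are not part of my actual code:

```julia
# Hypothetical helper: uniform-kernel estimate at one point x.
function uniform_kde_at(x, data, h)
    n = length(data)
    # count the observations with |x - X_i| / h <= 1
    hits = count(obs -> abs((x - obs) / h) <= 1, data)
    # each such observation contributes 0.5 / (n * h)
    return 0.5 * hits / (n * h)
end

data = [0.1, 0.15, 0.5]   # made-up observations
h = 0.1                   # bin width

# 0.1 and 0.15 lie within h of 0.12, so the value is 2 * 0.5 / (3 * 0.1), about 3.33
near = uniform_kde_at(0.12, data, h)

# no observation lies within h of 0.9, so the value is 0.0
far = uniform_kde_at(0.9, data, h)
```
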
Here is my implementation:
function kernelDensity(data)

    # Uniform kernel function.
    # @param x: current x value
    # @param observation: x value of the observation
    # @param width: bin width
    # Returns 1 if the distance from x (current) to x (observation),
    # scaled by the bin width, is at most 1, and 0 otherwise.
    function uniformKernel(x, observation, width)
        u = (x - observation) / width
        abs(u) <= 1 ? 1 : 0
    end

    # number of observations in the data set
    n = length(data)

    # bin width
    h = 0.1

    # vector that stores the pdf
    res = zeros(Real, n)

    # index of the current grid point
    counter = 0

    # lower and upper limit of the x axis
    start = floor(minimum(data))
    stop = ceil(maximum(data))

    # main loop over n equally spaced grid points
    # (range replaces linspace, which was removed in Julia 1.0)
    for x in range(start, stop=stop, length=n)
        counter += 1
        for observation in data
            # count the observations for which the kernel
            # returns 1 and multiply by 0.5 because the
            # scaled distance can be
            # either positive or negative
            res[counter] += 0.5 * uniformKernel(x, observation, h)
        end

        res[counter] /= n * h
    end

    res
end
#run function
#@rand: generates 10 uniform random numbers between 0 and 1
kernelDensity(rand(10))
and this returns:
> 0.0
> 1.5
> 2.5
> 1.0
> 1.5
> 1.0
> 0.0
> 0.5
> 0.5
> 0.0
The sum of these values is 8.5 (the cumulative distribution function should reach 1).
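For reference, the property I have in mind is the usual normalisation of a density, sketched here in LaTeX. The symbols \hat{f} (the estimate) and \hat{F} (its cdf) are my own notation and do not appear in the code:

```latex
\hat{F}(x) = \int_{-\infty}^{x} \hat{f}(t)\,dt,
\qquad
\lim_{x \to \infty} \hat{F}(x) = \int_{-\infty}^{\infty} \hat{f}(t)\,dt = 1
```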
So there are two errors:
- Values do not scale properly. Each number should be about one tenth of its current value. In fact, if the number of observations increases by a factor of 10 ^ n, n = 1, 2, ..., then the cdf also increases by a factor of 10 ^ n.
For example:
> kernelDensity(rand(1000))
> 953.53
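- The values do not sum to 10 (or to 1, if it were not for the scaling error). In the example with 1000 observations: approx. 5% is missing.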
As far as I can tell, I implemented the formula 1:1, so I do not understand where the error is.