Calculate the area under a density curve, i.e. a probability

I have a density estimate (obtained with the function density) for my data learningTime (see the figure below), and I need to find the probability Pr(learningTime > c), i.e. the area under the density curve from a given number c (the red vertical line) to the end of the curve. Any ideas?

[Figure: estimated density curve of learningTime, with a red vertical line at c]

1 answer

This is not difficult. Suppose we have some observed data x (in your case TMESAL$learningTime). As a reproducible example, I simply generate 1000 standard normal samples:

set.seed(0)
x <- rnorm(1000)

Now we run the density estimation with a few explicit settings:

d <- density.default(x, n = 512, cut = 3)
str(d)
#    List of 7
# $ x        : num [1:512] -3.91 -3.9 -3.88 -3.87 -3.85 ...
# $ y        : num [1:512] 2.23e-05 2.74e-05 3.35e-05 4.07e-05 4.93e-05 ...
# ... truncated ...

We extract d$x and d$y:

xx <- d$x  ## 512 evenly spaced points on [min(x) - 3 * d$bw, max(x) + 3 * d$bw]
dx <- xx[2L] - xx[1L]  ## spacing / bin size
yy <- d$y  ## 512 density values for `xx`
plot(xx, yy, type = "l")  ## plot density curve (or use `plot(d)`)

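There are several ways to do numerical integration. Since the grid is evenly spaced, a Riemann sum is a simple approximation of the total area under the curve: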

C <- sum(yy) * dx  ## sum(yy * dx)
# [1] 1.000976

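Because the Riemann sum is only an approximation, the result deviates slightly from 1 (the total probability). We call this value C the "normalizing constant".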
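The sum `sum(yy) * dx` above is a plain Riemann sum. As a cross-check, here is a minimal sketch of the same total area computed with the trapezoidal rule. The name `C.trap` is introduced here for illustration only and is not part of the original answer.

```r
## Same reproducible data and density estimate as above
set.seed(0)
x <- rnorm(1000)
d <- density.default(x, n = 512, cut = 3)

## Trapezoidal rule: average neighbouring density values over each grid cell
n <- length(d$x)
C.trap <- sum(diff(d$x) * (d$y[-1L] + d$y[-n]) / 2)
C.trap
```

Because the density is almost zero at both ends of the grid, the trapezoidal result should differ only marginally from the Riemann sum.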

Now suppose x0 = 1. The area over the tail, i.e. the interval [x0, Inf], can be approximated in the same way:

x0 <- 1
p.unscaled <- sum(yy[xx >= x0]) * dx
# [1] 0.1691366

For a proper probability estimate, this value should then be scaled by C:

p.scaled <- p.unscaled / C
# [1] 0.1689718
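The steps above can be collected into one function. This is only a sketch: `tail_prob` is a hypothetical helper name, not something provided by base R, and it assumes the evenly spaced grid that `density.default` returns.

```r
## Hypothetical helper: estimate Pr(X > x0) from a kernel density estimate
tail_prob <- function(x, x0, n = 512, cut = 3) {
  d  <- density.default(x, n = n, cut = cut)
  dx <- d$x[2L] - d$x[1L]          # grid spacing (the grid is evenly spaced)
  C  <- sum(d$y) * dx              # normalizing constant (total area)
  sum(d$y[d$x >= x0]) * dx / C     # scaled tail area
}

set.seed(0)
x <- rnorm(1000)
p <- tail_prob(x, x0 = 1)
p
```

For your own data, the call would look like `tail_prob(TMESAL$learningTime, x0 = c)`, where `c` stands for your threshold.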

Since the true density of the simulated x is known, we can compare this estimate with the true value:

pnorm(x0, lower.tail = FALSE)
# [1] 0.1586553

The estimate is fairly close to the true value.
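With real data the true distribution is unknown, so `pnorm` is not available as a reference. One possible sanity check, added here as a suggestion and not part of the original answer, is the empirical proportion of observations above the threshold. The name `p.emp` is illustrative.

```r
## Same reproducible data as above
set.seed(0)
x <- rnorm(1000)
x0 <- 1

## Empirical tail probability: share of observations strictly above x0
p.emp <- mean(x > x0)
p.emp
```

The kernel estimate smooths the data, so it will generally not equal this proportion exactly, but a large gap between the two would suggest a problem with the bandwidth or the integration grid.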


Source: https://habr.com/ru/post/1662175/

