R: Generate data from probability density distribution

Say I have a simple array and a corresponding kernel density estimate.

library(stats)
data <- c(0, 0.08, 0.15, 0.28, 0.90)
pdf_of_data <- density(data, from = 0, to = 1, bw = 0.1)

Is there a way to create another dataset from the same distribution? Since the sampling is random, the new data will not match the initial distribution exactly; it will simply be generated from it.

I haven't really had success in finding a simple solution. Thanks!

3 answers

The best approach is to build an approximate cumulative distribution function from the density estimate, approximate its inverse, and then pass uniform random numbers through it (inverse transform sampling).

The corresponding expression looks like this:

random.points <- approx(
  cumsum(pdf_of_data$y)/sum(pdf_of_data$y),
  pdf_of_data$x,
  runif(10000),
  rule = 2  # clamp uniforms below the first CDF value instead of returning NA
)$y

Result:

hist(random.points, 100)
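
One detail worth checking when comparing the histogram with the original curve: because the density was evaluated only on [0, 1], the area under `pdf_of_data` is less than 1, while the inverse-CDF draws follow the renormalised curve. Here is a minimal sketch of that comparison, assuming an evenly spaced grid (which `density()` returns); the names `cdf`, `dx` and `area` are introduced here just for illustration.

```r
set.seed(42)
data <- c(0, 0.08, 0.15, 0.28, 0.90)
pdf_of_data <- density(data, from = 0, to = 1, bw = 0.1)

# Normalised running sum of the density heights approximates the CDF on the grid.
cdf <- cumsum(pdf_of_data$y) / sum(pdf_of_data$y)
random.points <- approx(cdf, pdf_of_data$x, runif(10000), rule = 2)$y

# Area under the estimated curve on [0, 1] (Riemann sum over the grid).
dx <- diff(pdf_of_data$x)[1]
area <- sum(pdf_of_data$y) * dx

# Histogram on the density scale with the renormalised curve on top.
hist(random.points, 100, freq = FALSE)
lines(pdf_of_data$x, pdf_of_data$y / area, col = "red")
```

Dividing by `area` is what makes the red curve line up with the histogram; plotting `pdf_of_data` directly would sit visibly below it.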



From the examples in the documentation for ?density, you (almost) get an answer.

So, something like this:

library("stats")
data <- c(0, 0.08, 0.15, 0.28, 0.90)
pdf_of_data <- density(data, from = 0, to = 1, bw = 0.1)

# From the example.
N <- 1e6
x.new <- rnorm(N, sample(data, size = N, replace = TRUE), pdf_of_data$bw)

# Histogram of the draws with the distribution superimposed.
hist(x.new, freq = FALSE)
lines(pdf_of_data)
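
Note that `from` and `to` in `density()` only set the range of the grid on which the curve is evaluated; draws from the Gaussian kernels themselves can land outside [0, 1]. If that matters, one simple option is to discard those draws. This is a sketch under the assumption that truncating to [0, 1] is what you want; `x.trunc` and `keep` are names made up for this example.

```r
set.seed(42)
data <- c(0, 0.08, 0.15, 0.28, 0.90)
bw <- 0.1
N <- 1e5

# Smoothed bootstrap: pick a data point at random, then add Gaussian kernel noise.
x.new <- rnorm(N, mean = sample(data, size = N, replace = TRUE), sd = bw)

# Assumption: values outside [0, 1] are unwanted, so they are dropped.
keep <- x.new >= 0 & x.new <= 1
x.trunc <- x.new[keep]

# Fraction of draws that survive the truncation.
mean(keep)
```

For this data roughly 80% of the draws are kept, so request somewhat more than you need if you want a fixed sample size.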




To sample directly from the curve:

sample(pdf_of_data$x, size = 1e6, replace = TRUE, prob = pdf_of_data$y)
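
Keep in mind that this only ever returns values that lie on the evaluation grid (512 points by default), so the sample contains many repeats. If that is a problem, a possible workaround is to spread each draw uniformly over its grid cell. A rough sketch, where the jitter step is my own assumption and not part of `sample()`:

```r
set.seed(42)
data <- c(0, 0.08, 0.15, 0.28, 0.90)
pdf_of_data <- density(data, from = 0, to = 1, bw = 0.1)

n <- 1e5
# Weighted draw of grid points; sample() normalises the weights itself.
grid.draws <- sample(pdf_of_data$x, size = n, replace = TRUE, prob = pdf_of_data$y)

# Assumption: jitter each draw uniformly within half a grid step on either side.
dx <- diff(pdf_of_data$x)[1]
smooth.draws <- grid.draws + runif(n, -dx / 2, dx / 2)

# Number of distinct values before and after jittering.
length(unique(grid.draws))
length(unique(smooth.draws))
```

The jitter can push values up to half a grid step outside [0, 1], so clip or reject those if the bounds are strict.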

Source: https://habr.com/ru/post/1695130/

