Random number generation with a given density function

I want to specify a distribution density function and then collect N random numbers from this distribution in python. How can I do it?

+5
source share
2 answers

In general, you want to have the inverse cumulative probability density function. Once you do this, generating random numbers by distribution is simple:

import random def sample(n): return [ icdf(random.random()) for _ in range(n) ] 

Or if you use NumPy:

 import numpy as np def sample(n): return icdf(np.random.random(n)) 

In both cases, icdf is a function of the inverse cumulative distribution, which takes a value from 0 to 1 and outputs the corresponding value from the distribution.

To illustrate the nature of icdf , we take a simple uniform distribution between the values ​​10 and 12 as an example:

  • The probability distribution function is 0.5 between 10 and 12, zero elsewhere

  • the cumulative distribution function is 0 below 10 (without samples below 10), 1 above 12 (without samples above 12) and grows linearly between values ​​(PDF integral)

  • the inverse cumulative distribution function is determined only between 0 and 1. At 0 it is 10, at 12 it is 1 and it varies linearly between the values

Of course, the hard part is getting the inverse cumulative density function. It really depends on your distribution, sometimes you can have an analytic function, sometimes you can resort to interpolation. Numerical methods can be useful since numerical integration can be used to create a CDF, and interpolation can be used to invert it.

+6
source

This is my function to get one random number distributed according to a given probability density function. I used the Monte Carlo approach. Of course, n random numbers can be called by calling this function n times.

  """ Draws a random number from given probability density function. Parameters ---------- pdf -- the function pointer to a probability density function of form P = pdf(x) interval -- the resulting random number is restricted to this interval pdfmax -- the maximum of the probability density function integers -- boolean, indicating if the result is desired as integer max_iterations -- maximum number of 'tries' to find a combination of random numbers (rand_x, rand_y) located below the function value calc_y = pdf(rand_x). returns a single random number according the pdf distribution. """ def draw_random_number_from_pdf(pdf, interval, pdfmax = 1, integers = False, max_iterations = 10000): for i in range(max_iterations): if integers == True: rand_x = np.random.randint(interval[0], interval[1]) else: rand_x = (interval[1] - interval[0]) * np.random.random(1) + interval[0] #(b - a) * random_sample() + a rand_y = pdfmax * np.random.random(1) calc_y = pdf(rand_x) if(rand_y <= calc_y ): return rand_x raise Exception("Could not find a matching random number within pdf in " + max_iterations + " iterations.") 

In my opinion, this solution works better than other solutions if you do not need to extract a very large number of random variables. Another advantage is that you only need a PDF file and you do not need to calculate CDF, inverse CDF or weight.

0
source

Source: https://habr.com/ru/post/1201016/


All Articles