In the case of a simple mix of distributions, you can get an effective sampler using the “hack” that you first mentioned:
This particular case can be hacked by choosing one distribution kit or another sample in advance, but I would like to do it using one distribution kit, especially since I may need to configure the general distribution later.
This is actually a case of a Gibbs sample, which is very common in statistics. It is very flexible, and if you know the number of mixtures that you use, it will probably be difficult to beat. Choose one individual distribution from the entire ensemble to the sample, and then a sample from this conditional distribution. Rinse and repeat.
Here's a simple, non-optimized Haskell implementation for a sampler mixture of a Gaussian Gibbs. This is pretty simple, but you get the idea:
import System.Random import Control.Monad.State type ModeList = [(Double, Double)] -- A list of mean/stdev pairs, for each mode. -- Generate a Gaussian (0, 1) variate. boxMuller :: StdGen -> (Double, StdGen) boxMuller gen = (sqrt (-2 * log u1) * cos (2 * pi * u2), gen'') where (u1, gen') = randomR (0, 1) gen (u2, gen'') = randomR (0, 1) gen' sampler :: ModeList -> State StdGen Double sampler modeInfo = do gen <- get let n = length modeInfo (z0, g0) = boxMuller gen (c, g1) = randomR (0, n - 1) g0 -- Sample from the components. (cmu, csig) = modeInfo !! c put g1 return $ cmu + csig * z0 -- Sample from the conditional distribution.
Here is an example: sampling 100 times from a one-dimensional mixture of two Gaussians. The modes have values x = -3
and x = 2.5
, and each component of the mixture has its own separate dispersion. You can add as many modes as you want.
main = do let gen = mkStdGen 42 modeInfo = [(2.5, 1.0), (-3, 1.5)] samples = (`evalState` gen) . replicateM 100 $ sampler modeInfo print samples
Here's a smoothed plot of the density of these 100 samples (using R and ggplot2):
A more versatile algorithm will be sampling of deviation or importance, and in the case of more complex distributions, you probably want to manually perform the corresponding MCMC procedure. Here is a good introduction to Monte Carlo and the MCMC.