Gamma in Baum-Welch algorithm and float accuracy

I'm currently trying to implement the Baum-Welch algorithm in C, but I ran into a problem with the gamma function:

gamma(i,t) = alpha(i,t) * beta(i,t) / sum over `i` of(alpha(i,t) * beta(i,t))

Unfortunately, for sufficiently large observation sets, alpha drops rapidly to 0 as t increases, and beta drops rapidly to 0 as t decreases. Due to rounding, there is never a spot where both alpha and beta are nonzero, which makes things pretty problematic.

Is there a way around this problem, or should I just try to increase the precision of the values? I am afraid the problem may simply appear again if I try that approach, since alpha and beta drop by about one order of magnitude per observation.

2 answers

You should do these calculations, and generally all calculations for probabilistic models, in log space:

lg_gamma(i, t) = (lg_alpha(i, t) + lg_beta(i, t)
                  - logsumexp over i of (lg_alpha(i, t) + lg_beta(i, t)))

where lg_gamma(i, t) is the logarithm of gamma(i, t), etc., and logsumexp is the function described here. At the end of the calculation, you can convert back to probabilities using exp, if necessary (which is usually only required to display probabilities, but even there, logs may be preferable).

The base of the logarithm is not important as long as you use the same base everywhere. I prefer the natural logarithm because log saves typing compared to log2 :)


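Alternatively, you can rescale alpha and beta at every time step (observation) so that they do not underflow.
Multiply alpha and beta by a scaling coefficient c, where c is:

c(t) = 1 / sum over i of alpha(t,i), i = 1 ... number of states, t = time step (observation)

That is, at each time step you compute c(t) and multiply the alpha of every state at that time step by it. Do the same for beta.

For a good HMM tutorial that explains this procedure, see: (Rabiner 1989)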
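The scaling with c(t) described in this answer could look roughly like the C sketch below. The function names, the flat row-major array layout, and the parameter names (a for transitions, b for emissions, pi for the initial distribution) are all assumptions made for the example. The important detail is that scaling happens inside the recursion: each alpha(t, .) is computed from the already scaled alpha(t-1, .) and then normalized, and beta reuses the same c(t).

```c
#include <assert.h>
#include <stddef.h>

/* Scaled forward pass (sketch). Assumed layout:
 *   a[i*n + j] = P(state j at t+1 | state i at t)
 *   b[j*m + k] = P(symbol k | state j)
 *   alpha[t*n + j], c[t] are outputs; obs[t] is in 0..m-1. */
void forward_scaled(const double *a, const double *b, const double *pi,
                    const int *obs, size_t T, size_t n, size_t m,
                    double *alpha, double *c)
{
    for (size_t t = 0; t < T; t++) {
        double sum = 0.0;
        for (size_t j = 0; j < n; j++) {
            double p = 0.0;
            if (t == 0) {
                p = pi[j];
            } else {
                for (size_t i = 0; i < n; i++)
                    p += alpha[(t - 1) * n + i] * a[i * n + j];
            }
            alpha[t * n + j] = p * b[j * m + (size_t)obs[t]];
            sum += alpha[t * n + j];
        }
        assert(sum > 0.0);               /* assumes the observation is possible */
        c[t] = 1.0 / sum;                /* c(t) = 1 / sum_i alpha(t, i) */
        for (size_t j = 0; j < n; j++)
            alpha[t * n + j] *= c[t];    /* scaled alpha(t, .) now sums to 1 */
    }
}

/* Scaled backward pass (sketch): reuses the c(t) from the forward pass. */
void backward_scaled(const double *a, const double *b, const int *obs,
                     size_t T, size_t n, size_t m,
                     const double *c, double *beta)
{
    if (T == 0)
        return;
    for (size_t j = 0; j < n; j++)
        beta[(T - 1) * n + j] = c[T - 1];
    for (size_t t = T - 1; t-- > 0; ) {  /* t = T-2 down to 0 */
        for (size_t i = 0; i < n; i++) {
            double s = 0.0;
            for (size_t j = 0; j < n; j++)
                s += a[i * n + j] * b[j * m + (size_t)obs[t + 1]]
                     * beta[(t + 1) * n + j];
            beta[t * n + i] = c[t] * s;
        }
    }
}
```

With this particular scheme, gamma(i,t) can be obtained by normalizing the product of the scaled alpha and beta over i, exactly as in the formula from the question. The normalizing sum works out to c(t), so dividing the product by c(t) gives the same result.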


Source: https://habr.com/ru/post/1537705/

