How to get standard errors of maximum likelihood estimates in Stan?

I use maximum likelihood optimization in Stan, but unfortunately the optimizing() function does not report standard errors:

    > MLb4c <- optimizing(get_stanmodel(fitb4c), data = win.data, init = inits)
    STAN OPTIMIZATION COMMAND (LBFGS)
    init = user
    save_iterations = 1
    init_alpha = 0.001
    tol_obj = 1e-012
    tol_grad = 1e-008
    tol_param = 1e-008
    tol_rel_obj = 10000
    tol_rel_grad = 1e+007
    history_size = 5
    seed = 292156286
    initial log joint probability = -4038.66
        Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes
          13      -2772.49  9.21091e-005     0.0135987     0.07606      0.9845       15
    Optimization terminated normally:
      Convergence detected: relative gradient magnitude is below tolerance
    > t2 <- proc.time()
    > print(t2 - t1)
       user  system elapsed
       0.11    0.19    0.74
    >
    > MLb4c
    $par
           psi      alpha       beta
     0.9495000  0.4350983 -0.2016895

    $value
    [1] -2772.489

    > summary(MLb4c)
          Length Class  Mode
    par   3      -none- numeric
    value 1      -none- numeric

How can I get standard errors of the estimates (or confidence intervals / quantiles) and, possibly, p-values?

EDIT: I also tried what @Ben Goodrich kindly recommended:

    > MLb4cH <- optimizing(get_stanmodel(fitb4c), data = win.data, init = inits, hessian = TRUE)
    > sqrt(diag(solve(-MLb4cH$hessian)))
           psi      alpha       beta
    0.21138314 0.03251696 0.03270493

But these "unconditional" standard errors, apparently, are very different from the real ones: there is also a way out of the Bayesian fitting using stan() :

    > print(outb4c, dig = 5)
    Inference for Stan model: tmp_stan_model.
    3 chains, each with iter=500; warmup=250; thin=1;
    post-warmup draws per chain=250, total post-warmup draws=750.

                mean se_mean      sd        2.5%         25%         50%         75%       97.5% n_eff    Rhat
    alpha    0.43594 0.00127 0.03103     0.37426     0.41578     0.43592     0.45633     0.49915   594 1.00176
    beta    -0.20262 0.00170 0.03167    -0.26640    -0.22290    -0.20242    -0.18290    -0.13501   345 1.00402
    psi      0.94905 0.00047 0.01005     0.92821     0.94308     0.94991     0.95656     0.96632   448 1.00083
    lp__ -2776.94451 0.06594 1.15674 -2780.07437 -2777.50643 -2776.67139 -2776.09064 -2775.61263   308 1.01220
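For reference, the posterior standard deviations I am comparing against can be pulled out of the stanfit object directly (a standard rstan call; outb4c is the fit printed above):

    # Posterior standard deviations from the Bayesian fit, for comparison with the ML SEs
    round(summary(outb4c)$summary[c("alpha", "beta", "psi"), "sd"], 5)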
1 answer

You can specify the hessian = TRUE argument to the optimizing function, which returns the Hessian as part of the output list. Thus, you can get estimated standard errors with sqrt(diag(solve(-MLb4c$hessian))); however, these standard errors pertain to the estimates in the unconstrained space. To get estimated standard errors for the parameters in the constrained space, you can either use the delta method or draw many times from a multivariate normal distribution whose mean vector is MLb4c$par and whose variance-covariance matrix is solve(-MLb4c$hessian), convert those draws to the constrained space with the constrain_pars function, and take the standard deviation of each column.
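For the delta-method route, here is a minimal sketch. It assumes (these are assumptions about your model, not something stated above) that psi is declared with <lower=0, upper=1>, so its unconstraining transform is the logit, while alpha and beta are unconstrained, and that the fit was run with hessian = TRUE:

    # Minimal delta-method sketch: ASSUMES psi is declared <lower=0, upper=1>
    # (unconstraining transform = logit) and that alpha, beta are unconstrained.
    se_unc <- sqrt(diag(solve(-MLb4c$hessian)))   # SEs on the unconstrained scale

    psi_hat <- MLb4c$par["psi"]
    # The Jacobian of the inverse-logit transform is psi * (1 - psi), so:
    se_psi  <- psi_hat * (1 - psi_hat) * se_unc["psi"]

    # alpha and beta use the identity transform, so their SEs carry over directly
    se_con  <- c(psi   = unname(se_psi),
                 alpha = unname(se_unc["alpha"]),
                 beta  = unname(se_unc["beta"]))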

Here is some R code you could adapt to your case:

    # 1: Compile and save a model (make sure to pass the data here)
    model <- stan(file = "model.stan", data = c("N", "K", "X", "y"), chains = 0, iter = 0)

    # 2: Fit that model by maximum likelihood, keeping the Hessian
    fit <- optimizing(object = get_stanmodel(model), as_vector = FALSE,
                      data = c("N", "K", "X", "y"), hessian = TRUE)

    # 3: Extract the vector theta_hat and the Hessian for the unconstrained parameters
    theta_hat <- unlist(fit$par)
    upars <- unconstrain_pars(model, relist(theta_hat, fit$par))
    Hessian <- fit$hessian

    # 4: Extract the Cholesky decomposition of the (negative) Hessian and invert
    R <- chol(-Hessian)
    V <- chol2inv(R)
    rownames(V) <- colnames(V) <- colnames(Hessian)

    # 5: Produce a matrix with some specified number of simulation draws
    #    from a multivariate normal on the unconstrained scale
    SIMS <- 1000
    len <- length(theta_hat)
    unconstrained <- upars + t(chol(V)) %*% matrix(rnorm(SIMS * len), nrow = len, ncol = SIMS)
    theta_sims <- t(apply(unconstrained, 2, FUN = function(upars) {
      unlist(constrain_pars(model, upars))
    }))

    # 6: Produce estimated standard errors for the constrained parameters
    se <- apply(theta_sims, 2, sd)
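The question also asked about confidence intervals and p-values. Building on theta_sims and se from the code above, a sketch: quantile-based intervals come straight from the simulated draws, and Wald-style p-values (a normal approximation, testing each parameter against zero) can be formed from the point estimates and the simulated standard errors.

    # 7: Quantile-based 95% intervals for the constrained parameters
    ci <- t(apply(theta_sims, 2, quantile, probs = c(0.025, 0.975)))

    # 8: Wald-style two-sided p-values against a null value of zero
    #    (normal approximation; only meaningful where a zero null makes sense)
    est <- colMeans(theta_sims)
    p_values <- 2 * pnorm(-abs(est / se))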