I am having problems optimizing the multidimensional normal logarithmic likelihood in R. If anyone has a good solution for this, please let me know. In particular, I cannot show that the variance-covariance matrix is positive definite and the parameters are in a reasonable range.
Let me introduce the problem more fully. I am trying to simultaneously solve these two regression equations using MLE:
$$ y_1 = \ beta_1 + \ beta_2 x_1 + \ beta_3 x_2 \\ y_2 = \ beta_4 + \ beta_3 x_1 + \ beta_5 x_2 $$
The fact that $ \ beta_3 $ is in both equations is not an error. I am trying to solve this with MLE, maximizing the probability of a multidimensional normal distribution for $ Y = (y_1, y_2) ^ \ top $, where the average is parameterized, as indicated above, in the regression equations.
I attached the logarithmic likelihood function, since I believe that this should be where I restrict the variance covariance matrix to a positive-definite one, recreating it from the required positive eigenvalues and choles decomposition.
mvrestricted_ll <- function(par, Y, X) {
n <- nrow(X)
nbetas <- (2 + 3 * (ncol(Y) - 1))
beta <- par[1:nbetas]
eigvals <- exp(par[(nbetas + 1):(nbetas + ncol(Y))]) # constrain to be positive
chole <- par[(nbetas + ncol(Y) + 1):(nbetas + ncol(Y) + ncol(Y)*(ncol(Y)+1)/2)]
L <- diag(ncol(Y))
L[lower.tri(L, diag=T)] <- chole
Sigma <- diag(eigvals) + tcrossprod(L)
mu <- cbind(beta[1] + beta[2]*X[,1] + beta[3]*X[,2],
beta[4] + beta[3]*X[,1] + beta[5]*X[,2])
yminmu <- Y - mu
nlogs <- n * log(det(Sigma))
invSigma <- solve(Sigma)
meat <- yminmu %*% tcrossprod(invSigma, yminmu)
return(- nlogs - sum(diag(meat)))
}
n <- 1000
p <- 2
set.seed(20160201)
X <- matrix(rnorm(n*p), nrow = n)
set.seed(20160201)
Y <- matrix(rnorm(n*p), nrow = n)
initpars <- c(rep(0, (2 + 3 * (ncol(Y) - 1)) + ncol(Y) + ncol(Y)*(ncol(Y)+1)/2))
optim(par = initpars, fn = mvrestricted_ll, X=X, Y=Y, method = "BFGS")
optim(par = initpars, fn = mvrestricted_ll, X=X, Y=Y)
Any help would be greatly appreciated.
EDIT: I should point out that just letting Sigma be a vector in the parameters, and then returning a very large value when it is not positive definite, doesn't work either.