Defining a function that computes the covariance matrix of a correlation matrix

I have some problems with matrix conversion and row and column names.

My problem is this:

As an input matrix, I have a (symmetric) correlation matrix similar to this one:

[image: the 4 x 4 correlation matrix]

The correlation vector consists of the values in the lower triangle of this matrix:

[image: the vector of lower-triangle correlations r_01, r_02, ...]

Now I want to calculate the variance-covariance matrix of these correlations, which are approximately normally distributed with the following variance-covariance matrix:

[image: structure of the variance-covariance matrix of the correlations]

The variances can be approximated by

$$\operatorname{Var}(r_{ij}) \approx \frac{(1 - r_{ij}^2)^2}{N}$$

where N is the sample size (in this example, N = 66).

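The covariances can be approximated by

$$\operatorname{Cov}(r_{ij}, r_{kl}) \approx \frac{1}{N}\Big[\tfrac{1}{2}\, r_{ij} r_{kl}\,(r_{ik}^2 + r_{il}^2 + r_{jk}^2 + r_{jl}^2) + r_{ik} r_{jl} + r_{il} r_{jk} - (r_{ij} r_{ik} r_{il} + r_{ji} r_{jk} r_{jl} + r_{ki} r_{kj} r_{kl} + r_{li} r_{lj} r_{lk})\Big]$$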

For example, the covariance between r_02 and r_13 is given by

[image: the covariance formula with i = 0, j = 2, k = 1, l = 3]

Now I want to define a function in R that takes the correlation matrix as input and returns the variance-covariance matrix. However, I have problems implementing the covariance calculation. My idea is to name the elements of the correlation vector as shown above (r_01, r_02, ...). Then I want to create an empty variance-covariance matrix whose number of rows and columns equals the length of the correlation vector. Its rows and columns should have the same names as the elements of the correlation vector, so that I can address a cell by name, for example [01][03]. Then I want to implement a for loop that extracts the indices i and j from the row name and k and l from the column name, as they appear in the covariance formula. This should always give six different correlations (ij; ik; il; jk; jl; lk) to plug into the formula. This is my idea, but I do not understand how to implement it in R.

This is my code (without calculating covariance):

    require(corpcor)

    correlation_matrix_input <- matrix(data = c(1.00, 0.561, 0.393, 0.561,
                                                0.561, 1.00, 0.286, 0.549,
                                                0.393, 0.286, 1.00, 0.286,
                                                0.561, 0.549, 0.286, 1.00),
                                       ncol = 4, byrow = TRUE)
    N <- 66  # sample size

    # lower triangle of correlation_matrix_input as a vector
    vector_of_correlations <- sm2vec(correlation_matrix_input, diag = FALSE)

    # creates the empty variance-covariance matrix
    variance_covariance_matrix <- matrix(nrow = length(vector_of_correlations),
                                         ncol = length(vector_of_correlations))

    # function to fill the matrix by calculating the variances and the covariances
    variances_covariances <- function(vector_of_correlations_input, sample_size) {
      for (i in seq(along = vector_of_correlations_input)) {
        for (j in seq(along = vector_of_correlations_input)) {
          # calculate the variances for the diagonal
          if (i == j) {
            variance_covariance_matrix[i,j] = ((1-vector_of_correlations_input[i]**2)**2)/sample_size
          }
          # calculate the covariances
          if (i != j) {
            variance_covariance_matrix[i,j] = ???
          }
        }
      }
      return(variance_covariance_matrix)
    }

Does anyone have an idea how to implement covariance calculation using the above formula?

I would be grateful for any help in solving this problem!

3 answers

This is easier if you keep r as a matrix and use this helper function to make things clearer:

    covr <- function(r, i, j, k, l, n){
      if(i==k && j==l) return((1-r[i,j]^2)^2/n)
      ( 0.5 * r[i,j]*r[k,l]*(r[i,k]^2 + r[i,l]^2 + r[j,k]^2 + r[j,l]^2)
        + r[i,k]*r[j,l] + r[i,l]*r[j,k]
        - (r[i,j]*r[i,k]*r[i,l] + r[j,i]*r[j,k]*r[j,l] + r[k,i]*r[k,j]*r[k,l] + r[l,i]*r[l,j]*r[l,k])
      )/n
    }
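As a quick check of the helper, the example pair from the question can be evaluated directly. This is only a sketch: it assumes R's 1-based indexing, so the question's r_02 and r_13 become r[1,3] and r[2,4], and the names `example_cov` and `example_var` are illustrative. The helper is repeated so the snippet runs on its own.

```r
# Correlation matrix and sample size taken from the question.
r <- matrix(c(1.000, 0.561, 0.393, 0.561,
              0.561, 1.000, 0.286, 0.549,
              0.393, 0.286, 1.000, 0.286,
              0.561, 0.549, 0.286, 1.000),
            ncol = 4, byrow = TRUE)

# Same helper as above, repeated to keep the snippet self-contained.
covr <- function(r, i, j, k, l, n) {
  if (i == k && j == l) return((1 - r[i, j]^2)^2 / n)
  (0.5 * r[i, j] * r[k, l] *
     (r[i, k]^2 + r[i, l]^2 + r[j, k]^2 + r[j, l]^2) +
   r[i, k] * r[j, l] + r[i, l] * r[j, k] -
   (r[i, j] * r[i, k] * r[i, l] + r[j, i] * r[j, k] * r[j, l] +
    r[k, i] * r[k, j] * r[k, l] + r[l, i] * r[l, j] * r[l, k])) / n
}

# Covariance between r_13 and r_24 (the question's r_02 and r_13).
example_cov <- covr(r, 1, 3, 2, 4, 66)

# Same pair twice: the helper returns the variance of r_13.
example_var <- covr(r, 1, 3, 1, 3, 66)
```

Here `example_cov` comes out at roughly 0.001128 and `example_var` at roughly 0.010833.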

Now define this second function:

    vcovr <- function(r, n){
      p <- combn(nrow(r), 2)
      q <- seq(ncol(p))
      outer(q, q, Vectorize(function(x,y) covr(r, p[1,x], p[2,x], p[1,y], p[2,y], n)))
    }

And voila:

    > vcovr(correlation_matrix_input, 66)
                [,1]        [,2]        [,3]        [,4]        [,5]        [,6]
    [1,] 0.007115262 0.001550264 0.002917481 0.003047666 0.003101602 0.001705781
    [2,] 0.001550264 0.010832674 0.001550264 0.006109565 0.001127916 0.006109565
    [3,] 0.002917481 0.001550264 0.007115262 0.001705781 0.003101602 0.003047666
    [4,] 0.003047666 0.006109565 0.001705781 0.012774221 0.002036422 0.006625868
    [5,] 0.003101602 0.001127916 0.003101602 0.002036422 0.007394554 0.002036422
    [6,] 0.001705781 0.006109565 0.003047666 0.006625868 0.002036422 0.012774221
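The question also asks for named rows and columns. One possible way, sketched here under some assumptions, is to build the labels from the same combn() index pairs that vcovr() uses, so the labels line up with the rows and columns of its result. The helper name `label_pairs` and its `offset` argument are hypothetical, and the NA matrix only stands in for the output of vcovr().

```r
# Hypothetical helper: builds labels such as "r_12", "r_13", ... in the
# column order produced by combn(), i.e. the row/column order of vcovr().
# offset = -1 gives the 0-based labels used in the question.
label_pairs <- function(p_vars, offset = 0) {
  pairs <- combn(p_vars, 2)
  paste0("r_", pairs[1, ] + offset, pairs[2, ] + offset)
}

labels  <- label_pairs(4)        # "r_12" "r_13" "r_14" "r_23" "r_24" "r_34"
labels0 <- label_pairs(4, -1)    # "r_01" "r_02" "r_03" "r_12" "r_13" "r_23"

# Placeholder for vcovr(correlation_matrix_input, 66).
v <- matrix(NA_real_, nrow = 6, ncol = 6)
dimnames(v) <- list(labels, labels)

# Cells can now be addressed by name.
cell <- v["r_13", "r_24"]
```

With more than nine variables the two indices would run together, so a separator between them (for example "r_1_12") would be needed.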

EDIT:

For z-transformed values, as in your comment, you can use this:

    covrZ <- function(r, i, j, k, l, n){
      if(i==k && j==l) return(1/(n-3))
      covr(r, i, j, k, l, n) / ((1-r[i,j]^2)*(1-r[k,l]^2))
    }

And just use it in place of covr in a copy of vcovr:

    vcovrZ <- function(r, n){
      p <- combn(nrow(r), 2)
      q <- seq(ncol(p))
      outer(q, q, Vectorize(function(x,y) covrZ(r, p[1,x], p[2,x], p[1,y], p[2,y], n)))
    }

New result:

    > vcovrZ(correlation_matrix_input,66)
                [,1]        [,2]        [,3]        [,4]        [,5]        [,6]
    [1,] 0.015873016 0.002675460 0.006212598 0.004843517 0.006478743 0.002710920
    [2,] 0.002675460 0.015873016 0.002675460 0.007869213 0.001909452 0.007869213
    [3,] 0.006212598 0.002675460 0.015873016 0.002710920 0.006478743 0.004843517
    [4,] 0.004843517 0.007869213 0.002710920 0.015873016 0.003174685 0.007858948
    [5,] 0.006478743 0.001909452 0.006478743 0.003174685 0.015873016 0.003174685
    [6,] 0.002710920 0.007869213 0.004843517 0.007858948 0.003174685 0.015873016
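One way to read the divisor in covrZ, assuming the Z values are Fisher's z = atanh(r): the derivative of atanh(r) is 1/(1 - r^2), so a first-order approximation rescales each covariance by that factor for both correlations involved. The sketch below redoes this for a single entry by hand, using numbers copied from the question and from the vcovr output; the variable names are illustrative.

```r
n   <- 66
r12 <- 0.561
r13 <- 0.393

# cov(r_12, r_13), copied from entry [1, 2] of the vcovr() output.
cov_r <- 0.001550264

# Rescale by the derivative of atanh() at each of the two correlations.
cov_z <- cov_r / ((1 - r12^2) * (1 - r13^2))

# Variance used on the diagonal for a z-transformed correlation.
var_z <- 1 / (n - 3)
```

Both numbers agree with the matrix above to the printed precision: `cov_z` is about 0.0026755 (entry [1, 2]) and `var_z` is about 0.0158730 (the diagonal).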

I wrote an approach using combn and row/column indices to generate the various combinations of p.

    variances_covariances <- function(m, n) {
      r <- m[lower.tri(m)]
      var <- (1-r^2)^2

      ## row/column indices of each lower-triangle element, in the same order as r
      rowIdx <- row(m)[lower.tri(m)]
      colIdx <- col(m)[lower.tri(m)]

      ## generate combinations
      cov <- combn(length(r), 2, FUN=function(i) {
        ## current row/column indices
        cr <- rowIdx[i] ## i,k
        cc <- colIdx[i] ## j,l

        ## define 6 cases
        p.ij <- m[cr[1], cc[1]]
        p.ik <- m[cr[1], cr[2]]
        p.il <- m[cr[1], cc[2]]
        p.jk <- m[cc[1], cr[2]]
        p.jl <- m[cc[1], cc[2]]
        p.kl <- m[cr[2], cc[2]]

        ## calculate covariance
        co <- 0.5 * p.ij * p.kl * (p.ik^2 + p.il^2 + p.jk^2 + p.jl^2) +
          p.ik * p.jl + p.il * p.jk -
          (p.ij * p.ik * p.il + p.ij * p.jk * p.jl + p.ik * p.jk * p.kl + p.il * p.jl * p.kl)
        return(co)
      })

      ## create output matrix
      com <- matrix(NA, ncol=length(r), nrow=length(r))
      com[lower.tri(com)] <- cov
      com[upper.tri(com)] <- t(com)[upper.tri(com)]
      diag(com) <- var
      return(com/n)
    }

Output:

    m <- matrix(data=c(1.000, 0.561, 0.393, 0.561,
                       0.561, 1.000, 0.286, 0.549,
                       0.393, 0.286, 1.000, 0.286,
                       0.561, 0.549, 0.286, 1.00), ncol=4, byrow=T)
    variances_covariances(m, 66)
    #            [,1]        [,2]        [,3]        [,4]        [,5]        [,6]
    #[1,] 0.007115262 0.001550264 0.002917481 0.003047666 0.003101602 0.001705781
    #[2,] 0.001550264 0.010832674 0.001550264 0.006109565 0.001127916 0.006109565
    #[3,] 0.002917481 0.001550264 0.007115262 0.001705781 0.003101602 0.003047666
    #[4,] 0.003047666 0.006109565 0.001705781 0.012774221 0.002036422 0.006625868
    #[5,] 0.003101602 0.001127916 0.003101602 0.002036422 0.007394554 0.002036422
    #[6,] 0.001705781 0.006109565 0.003047666 0.006625868 0.002036422 0.012774221
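The pairing between lower-triangle elements and their row/column indices can be inspected with which(..., arr.ind = TRUE). This is a small illustrative sketch, not part of the answer's code; the names `idx`, `pairs` and `same_order` are placeholders. For a 4 x 4 matrix it shows that m[lower.tri(m)] walks through the variable pairs in the same order as combn(4, 2).

```r
m <- matrix(0, 4, 4)                         # only the shape matters here

# One row per lower-triangle element, with columns named "row" and "col",
# listed in the same column-major order that m[lower.tri(m)] uses.
idx <- which(lower.tri(m), arr.ind = TRUE)

# Variable pairs in combn() order: (1,2), (1,3), (1,4), (2,3), (2,4), (3,4).
pairs <- t(combn(nrow(m), 2))

# The smaller index of each pair is the column, the larger one the row.
same_order <- all(idx[, "col"] == pairs[, 1]) && all(idx[, "row"] == pairs[, 2])
```

Since `same_order` is TRUE, a result indexed by lower-triangle position can be compared cell by cell with one indexed by combn() columns.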

I hope I did everything right.


Hello,

    variance_covariance_matrix <- diag(variance_vector, length(r), length(r))
    pcomb <- combn(length(r), 2)
    for (k in 1:length(r)) {
      i <- pcomb[1,k]
      j <- pcomb[2,k]
      variance_covariance_matrix[i,j] <- variance_covariance_matrix[j,i] <-
        genCorr[k] * sqrt(sig2g[i]) * sqrt(sig2g[j])
    }

Source: https://habr.com/ru/post/1499915/

