R ginv and Matlab pinv give different results

ginv() from the MASS package to R ginv() completely different values โ€‹โ€‹compared to the MATLAB pinv() function. They both claim to produce a generalized inverse Moore-Penrose matrix.

I tried to set the same tolerance for the implementation of R, but the difference persists.

  • MATLAB default tol: max(size(A)) * norm(A) * eps(class(A))
  • R default tol: sqrt(.Machine$double.eps)

reproduction:

R:

 library(MASS) A <- matrix(c(47,94032,149, 94032, 217179406,313679,149,313679,499),3,3) ginv(A) 

outputs:

  [,1] [,2] [,3] [1,] 1.675667e-03 -8.735203e-06 5.545605e-03 [2,] -8.735203e-06 5.014084e-08 -2.890907e-05 [3,] 5.545605e-03 -2.890907e-05 1.835313e-02 

svd(A)

outputs:

 $d [1] 2.171799e+08 4.992800e+01 2.302544e+00 $u [,1] [,2] [,3] [1,] -0.0004329688 0.289245088 -9.572550e-01 [2,] -0.9999988632 -0.001507826 -3.304234e-06 [3,] -0.0014443299 0.957253888 2.892454e-01 $v [,1] [,2] [,3] [1,] -0.0004329688 0.289245088 -9.572550e-01 [2,] -0.9999988632 -0.001507826 -3.304234e-06 [3,] -0.0014443299 0.957253888 2.892454e-01 

MATLAB:

 A = [47 94032 149; 94032 217179406 313679; 149 313679 499] pinv(A) 

outputs:

 ans = 0.3996 -0.0000 -0.1147 -0.0000 0.0000 -0.0000 -0.1147 -0.0000 0.0547 

SVD:

 [U, S, V] = svd(A) U = -0.0004 0.2892 -0.9573 -1.0000 -0.0015 -0.0000 -0.0014 0.9573 0.2892 S = 1.0e+008 * 2.1718 0 0 0 0.0000 0 0 0 0.0000 V = -0.0004 0.2892 -0.9573 -1.0000 -0.0015 -0.0000 -0.0014 0.9573 0.2892 

Solution: to make R ginv look like MATLAB pinv use this function:

 #' Pseudo-Inverse of Matrix #' @description #' This is the modified version of ginv function in MASS package. #' It produces MATLAB like pseudo-inverse of a matrix #' @param X The matrix to compute the pseudo-inverse #' @param tol The default is the same as MATLAB pinv function #' #' @return The pseudo inverse of the matrix #' @export #' #' @examples #' A <- matrix(1:6,3,2) #' pinv(A) pinv <- function (X, tol = max(dim(X)) * max(X) * .Machine$double.eps) { if (length(dim(X)) > 2L || !(is.numeric(X) || is.complex(X))) stop("'X' must be a numeric or complex matrix") if (!is.matrix(X)) X <- as.matrix(X) Xsvd <- svd(X) if (is.complex(X)) Xsvd$u <- Conj(Xsvd$u) Positive <- any(Xsvd$d > max(tol * Xsvd$d[1L], 0)) if (Positive) Xsvd$v %*% (1 / Xsvd$d * t(Xsvd$u)) else array(0, dim(X)[2L:1L]) } 
+5
source share
1 answer

Running debugonce(MASS::ginv) , we see that the difference is what is done with the decomposition of the singular value.

In particular, R checks the following:

 Xsvd <- svd(A) Positive <- Xsvd$d > max(tol * Xsvd$d[1L], 0) Positive # [1] TRUE TRUE FALSE 

If the third element was true (which we can force by setting tol = 0 , as suggested by @nicola), MASS::ginv will return:

 Xsvd$v %*% (1/Xsvd$d * t(Xsvd$u)) # [,1] [,2] [,3] # [1,] 3.996430e-01 -7.361507e-06 -1.147047e-01 # [2,] -7.361507e-06 5.014558e-08 -2.932415e-05 # [3,] -1.147047e-01 -2.932415e-05 5.468812e-02 

(i.e. the same as MATLAB).

Instead, it returns:

 Xsvd$v[, Positive, drop = FALSE] %*% ((1/Xsvd$d[Positive]) * t(Xsvd$u[, Positive, drop = FALSE])) # [,1] [,2] [,3] # [1,] 1.675667e-03 -8.735203e-06 5.545605e-03 # [2,] -8.735203e-06 5.014084e-08 -2.890907e-05 # [3,] 5.545605e-03 -2.890907e-05 1.835313e-02 

Thanks to @FaridCher for specifying pinv source code.

I'm not sure I understand the MATLAB code 100%, but I think it comes down to the difference in using tol . The correspondence of MATLAB Positive in R is:

 `r = sum(s>tol)` 

Where tol is what the user provides; if none are specified, we get:

 m = 0; % I don't get the point of this for loop -- why not just `m = max(size(A))`? for i = 1:nm = max(m,length(A(:,i))); end % contrast with simply `tol * Xsvd$d[1L]` in R % (note: i believe the elements of d are sorted largest to smallest) tol = m*eps(max(s)); 
+4
source

Source: https://habr.com/ru/post/1246363/


All Articles