How did the cor() function get faster?

Slightly off topic, but can someone tell me when and how the cor() function was improved recently? It is much faster than I remember, and is now comparable in speed to the rcorr function in the Hmisc package, which was my alternative correlation function for large matrices.

Thanks for all the suggestions. After some investigation, the speed difference turns out to come from using the use = "pairwise" flag rather than from an algorithmic change. Timings differ by a factor of about 8 when this option is used.

The speed of cor() is comparable across R versions 2.4 through 2.13.
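For anyone who wants to check the effect of the use flag on their own machine, here is a rough sketch. The matrix size and the number of missing values are made-up assumptions, and the actual ratio will depend on the R version and the data, so no timings are quoted here.

```r
# Sketch: time cor() with and without pairwise deletion on simulated data.
# The dimensions and the amount of missingness are arbitrary assumptions.
set.seed(1)
x <- matrix(rnorm(2000 * 200), nrow = 2000, ncol = 200)
x[sample(length(x), 100)] <- NA   # sprinkle in a few missing values

# Default handling: NAs propagate into the result.
t_everything <- system.time(r_all <- cor(x, use = "everything"))["elapsed"]

# Pairwise deletion: each pair of columns uses its own complete rows.
t_pairwise <- system.time(
  r_pair <- cor(x, use = "pairwise.complete.obs")
)["elapsed"]

print(c(everything = unname(t_everything), pairwise = unname(t_pairwise)))

# "pairwise" is accepted as an abbreviation of "pairwise.complete.obs".
same_result <- identical(r_pair, cor(x, use = "pairwise"))
```

Repeating the run a few times gives a steadier comparison than a single system.time() call.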

Thanks,

Yane

+6
2 answers

http://cran.r-project.org/src/base/NEWS.html provides a summary of the latest changes and an explanation of their relevance. It is sometimes useful for picking up related changes in other functions that may affect what you do. However, a quick search for cor() turns up only a couple of items:

2.13.0

The rank correlation methods for cor() and cov() with use = "complete.obs" computed the ranks before removing the missing values, whereas the documentation implied that incomplete cases were removed first. (PR#14488, https://bugs.R-project.org/bugzilla3/show_bug.cgi?id=14488)

2.11.0

cor() and cov() now check for misuse with non-numeric arguments, such as the case in report PR#14207 (https://bugs.R-project.org/bugzilla3/show_bug.cgi?id=14207).
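Both entries can be checked with a small sketch, assuming R 2.13.0 or later. The vectors below are made up for illustration. The first part compares the built-in Spearman correlation against ranks computed by hand after dropping incomplete cases; the second part confirms that a non-numeric argument is rejected.

```r
# Sketch, assuming R >= 2.13.0; the data are made up for illustration.
x <- c(1, 2, 3, 4, NA, 6)
y <- c(2, 1, 4, 3, 5, NA)

# Documented behaviour: drop incomplete cases first, then rank.
ok <- complete.cases(x, y)
manual <- cor(rank(x[ok]), rank(y[ok]))

# Built-in rank correlation with complete-case handling.
builtin <- cor(x, y, method = "spearman", use = "complete.obs")
print(all.equal(manual, builtin))

# Non-numeric input should raise an error instead of returning a value.
rejected <- tryCatch({
  cor(c("a", "b", "c"), 1:3)
  FALSE
}, error = function(e) {
  message("cor() refused the input: ", conditionMessage(e))
  TRUE
})
```

If manual and builtin disagree, the installed version probably predates the 2.13.0 fix.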

+5

It's hard to say without knowing which version you are using, but it looks like there are significant changes in 2.14, and only minor changes between 2.13 and earlier versions, going back at least to 2.10. Compare the two files below to see what changed in 2.14:

2.13 code: https://svn.r-project.org/R/branches/R-2-13-branch/src/main/cov.c

2.14 code: https://svn.r-project.org/R/branches/R-2-14-branch/src/main/cov.c

+3

Source: https://habr.com/ru/post/899474/

