How to control the number of processors used by R?

I use the crossmatch R package, which itself depends on some other R packages (survival, nbpMatching, MASS), which in turn import a wide range of further dependencies. The crossmatch package implements a statistical test on a (potentially) large matrix, which I need to compute very often (within an MCMC algorithm). I wrote the following wrapper, which performs some preprocessing steps before running the actual test (the call to crossmatchtest() on the last line):

# wrapper function to directly call the crossmatch test with a single matrix
# first column of the matrix must be a binary group indicator, following columns are observations
# code is modified from the documentation of the crossmatch package
crossmatchdata <- function(dat) {

  # the grouping variable should be in the first column
  z = dat[,1]
  X = subset(dat, select = -1)

  ## Rank based Mahalanobis distance between each pair:
  # X <- as.matrix(X)
  n <- dim(X)[1]
  k <- dim(X)[2]

  for (j in 1:k) {
    X[, j] <- rank(X[, j])
  }

  cv <- cov(X)
  vuntied <- var(1:n)
  rat <- sqrt(vuntied / diag(cv))

  cv <- diag(rat) %*% cv %*% diag(rat)
  out <- matrix(NA, n, n)

  icov <- ginv(cv)
  for (i in 1:n) {
    out[i, ] <- mahalanobis(X, X[i, ], icov, inverted = TRUE)
  }

  dis <- out

  ## The cross-match test:
  return(crossmatchtest(z, dis))
}

I noticed that if the matrix is quite small, the test uses only one processor:

library(MASS)
library(crossmatch)
source("theCodeFromAbove.R")
# create a dummy matrix
# 200 rows: 100 per group, 10 observation columns (row counts must match for cbind)
m = cbind(c(rep(0, 100), rep(1, 100)))
m = cbind(m, matrix(runif(2000), ncol=10, nrow=200, byrow=TRUE))
while(TRUE) { crossmatchdata(m) }

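I observed this with htop. However, if I increase the matrix size, R uses as many cores as are available (at least, that is how it looks):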

# create a dummy matrix
m = cbind(c(rep(0, 1000), rep(1, 1000)))
m = cbind(m, (matrix(runif(100000), ncol=1000, nrow=2000, byrow=T)))
while(TRUE) { crossmatchdata(m) }

I would like to control manually how many cores are used by R. I tried options(mc.cores = 4), but without success.

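Is there another setting I could use to limit the number of cores? Or is there a way to find out which package is responsible for using multiple cores?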


Here is the dependency graph of crossmatch:

library(miniCRAN)
tags <- "crossmatch"
dg <- makeDepGraph(tags, enhances = FALSE, suggests = FALSE)
set.seed(1)
plot(dg, legendPosition = c(-1, 1), vertex.size = 20)

[Image: the resulting dependency graph]

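From the graph alone it is hard to say which package makes R use multiple cores. One candidate is data.table (it is among the dependencies), which is multithreaded; its thread count can be limited with setDTthreads(1).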
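As a minimal sketch of the data.table setting mentioned above: the snippet below caps data.table's thread pool and prints the value before and after. It assumes data.table is installed and does nothing otherwise; whether data.table is actually the source of the extra CPU load in this case is not confirmed.

```r
# Sketch: cap data.table's internal thread pool, if the package is installed.
if (requireNamespace("data.table", quietly = TRUE)) {
  old_threads <- data.table::getDTthreads()  # threads data.table currently uses
  data.table::setDTthreads(1)                # restrict data.table to one thread
  cat("data.table threads:", old_threads, "->",
      data.table::getDTthreads(), "\n")
} else {
  message("data.table is not installed; nothing to limit")
}
```

Call this before the MCMC loop starts, so that every later data.table operation in the session runs with the reduced thread count.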

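Otherwise, your R may be linked against a multithreaded BLAS, in which case the matrix operations themselves run on several cores.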
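One way to check the BLAS hypothesis using base R only (assuming R 3.4.0 or later, where these functions report the libraries in use) is to print which BLAS and LAPACK the session is linked against, and then compare CPU time with wall-clock time for a BLAS-heavy operation. The matrix size and repeat count are arbitrary choices for illustration.

```r
# Paths of the BLAS and LAPACK libraries this R session is linked against
# (these may be empty strings on some builds).
blas_path   <- unname(extSoftVersion()["BLAS"])
lapack_path <- La_library()
cat("BLAS:  ", blas_path, "\n")
cat("LAPACK:", lapack_path, "\n")

# Time a BLAS-heavy operation. If "user" CPU time clearly exceeds the
# "elapsed" wall-clock time, more than one core was busy.
set.seed(1)
a  <- matrix(runif(500 * 500), nrow = 500)
tm <- system.time(for (i in 1:10) crossprod(a))
cat("user:", tm[["user.self"]], " elapsed:", tm[["elapsed"]], "\n")
```

A library path that mentions OpenBLAS, MKL, or similar, together with user time well above elapsed time, would suggest that the BLAS is what spreads the work across cores.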

Update:

As @Dirk Eddelbuettel suggested, the RhpcBLASctl and OpenMPController packages can be used to control the number of cores used by BLAS and OpenMP.

kartoffelsalat:

The following worked for me on Ubuntu 16.04. It did not work on macOS (and neither did OpenMPController).

library(RhpcBLASctl)
blas_set_num_threads(3)
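The call above limits BLAS only. As a hedged sketch, a small helper (the name limit_threads is hypothetical, not part of any package) can apply the same limit to both BLAS and OpenMP through RhpcBLASctl and report what the library then sees. It assumes RhpcBLASctl is installed and leaves everything unchanged if it is not.

```r
# Hypothetical helper: apply one thread limit to both BLAS and OpenMP.
limit_threads <- function(n) {
  if (!requireNamespace("RhpcBLASctl", quietly = TRUE)) {
    message("RhpcBLASctl is not installed; thread limits left unchanged")
    return(invisible(FALSE))
  }
  RhpcBLASctl::blas_set_num_threads(n)  # threads used by the BLAS library
  RhpcBLASctl::omp_set_num_threads(n)   # threads used by OpenMP regions
  invisible(TRUE)
}

ok <- limit_threads(3)
if (ok) {
  cat("BLAS procs reported:", RhpcBLASctl::blas_get_num_procs(), "\n")
  cat("OpenMP max threads: ", RhpcBLASctl::omp_get_max_threads(), "\n")
}
```

Whether the limit takes effect depends on the BLAS build R is linked against, which may explain why results differ between platforms.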

Source: https://habr.com/ru/post/1690775/

