Set the maximum number of threads at runtime for numpy / OpenBLAS

I would like to know whether it is possible to change, at (Python) runtime, the maximum number of threads that OpenBLAS uses behind numpy.

I know that you can set it before starting the interpreter through the OMP_NUM_THREADS environment variable, but I would like to change it at runtime.

For example, when using MKL instead of OpenBLAS, this is possible:

    import mkl
    mkl.set_num_threads(n)
2 answers

You can do this by calling the openblas_set_num_threads function using ctypes. I often find myself wanting to do this, so I wrote a small context manager:

    import contextlib
    import ctypes
    from ctypes.util import find_library

    # Prioritize hand-compiled OpenBLAS library over version in /usr/lib/
    # from Ubuntu repos
    try_paths = ['/opt/OpenBLAS/lib/libopenblas.so',
                 '/lib/libopenblas.so',
                 '/usr/lib/libopenblas.so.0',
                 find_library('openblas')]
    openblas_lib = None
    for libpath in try_paths:
        if libpath is None:
            # find_library returns None when it finds nothing; skip it so
            # that LoadLibrary is never called with None
            continue
        try:
            openblas_lib = ctypes.cdll.LoadLibrary(libpath)
            break
        except OSError:
            continue
    if openblas_lib is None:
        raise EnvironmentError('Could not locate an OpenBLAS shared library', 2)


    def set_num_threads(n):
        """Set the current number of threads used by the OpenBLAS server."""
        openblas_lib.openblas_set_num_threads(int(n))


    # At the time of writing these symbols were very new:
    # https://github.com/xianyi/OpenBLAS/commit/65a847c
    try:
        openblas_lib.openblas_get_num_threads()

        def get_num_threads():
            """Get the current number of threads used by the OpenBLAS server."""
            return openblas_lib.openblas_get_num_threads()
    except AttributeError:
        def get_num_threads():
            """Dummy function (symbol not present), returns -1."""
            return -1

    try:
        openblas_lib.openblas_get_num_procs()

        def get_num_procs():
            """Get the total number of physical processors"""
            return openblas_lib.openblas_get_num_procs()
    except AttributeError:
        def get_num_procs():
            """Dummy function (symbol not present), returns -1."""
            return -1


    @contextlib.contextmanager
    def num_threads(n):
        """Temporarily changes the number of OpenBLAS threads.

        Example usage:

            print("Before: {}".format(get_num_threads()))
            with num_threads(n):
                print("In thread context: {}".format(get_num_threads()))
            print("After: {}".format(get_num_threads()))
        """
        old_n = get_num_threads()
        set_num_threads(n)
        try:
            yield
        finally:
            set_num_threads(old_n)

You can use it as follows:

    with num_threads(8):
        np.dot(x, y)
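The try/finally inside the context manager is what restores the previous thread count even when the body raises. The following is a minimal sketch of that behaviour only: it replaces the OpenBLAS calls with a hypothetical in-memory stand-in (`_state`, not part of the answer) so that it runs without any BLAS library installed.

```python
import contextlib

# Hypothetical stand-in for the OpenBLAS thread setting (an assumption made
# for illustration; the real code calls into the shared library instead).
_state = {"threads": 4}


def get_num_threads():
    return _state["threads"]


def set_num_threads(n):
    _state["threads"] = int(n)


@contextlib.contextmanager
def num_threads(n):
    """Same shape as the context manager above, using the stand-in."""
    old_n = get_num_threads()
    set_num_threads(n)
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions alike.
        set_num_threads(old_n)


def run_and_fail():
    """Return (value seen inside the block, value after the block)."""
    inside = None
    try:
        with num_threads(1):
            inside = get_num_threads()
            raise ValueError("simulated failure inside the block")
    except ValueError:
        pass
    return inside, get_num_threads()
```

Calling `run_and_fail()` shows the limit applied inside the block and the original value back in place afterwards, despite the exception.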

As noted in the code comments, openblas_get_num_threads and openblas_get_num_procs were very new functions at the time of writing, so they may not be available unless you compiled OpenBLAS from a recent version of the source code.
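If you prefer to check for an optional symbol at call time rather than at import time, one possible approach is sketched below. The helper name `call_if_available` is hypothetical; it relies only on the fact that a ctypes library object raises AttributeError for a symbol it does not export, so `getattr` with a default works on it.

```python
def call_if_available(lib, symbol, default=None):
    """Call the zero-argument function `symbol` on `lib` if it is exported.

    `lib` is expected to be a ctypes library object, but any object works,
    which keeps this sketch runnable without OpenBLAS installed.
    Returns `default` when the symbol is missing.
    """
    func = getattr(lib, symbol, None)
    if func is None:
        return default
    return func()


# Intended use with the library handle from the answer:
#     n = call_if_available(openblas_lib, "openblas_get_num_threads", default=-1)
```

This mirrors the -1 fallback used above, but keeps the check in one place instead of defining dummy functions.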


We recently developed threadpoolctl, a cross-platform package for controlling the number of threads used by C-level thread pools called from Python. It works similarly to @ali_m's answer, but it automatically detects the libraries that need to be limited by inspecting all loaded libraries. It also comes with an introspection API.

This package can be installed using pip install threadpoolctl and comes with a context manager that allows you to control the number of threads used by packages such as numpy:

    from threadpoolctl import threadpool_limits
    import numpy as np

    with threadpool_limits(limits=1, user_api='blas'):
        # In this block, calls to blas implementation (like openblas or MKL)
        # will be limited to use only one thread. They can thus be used jointly
        # with thread-parallelism.
        a = np.random.randn(1000, 1000)
        a_squared = a @ a

You can also have more precise control over the various thread pools (e.g. differentiating BLAS from OpenMP calls).
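As a rough sketch of what per-pool control can look like, the snippet below passes a dictionary of limits keyed by user API and then uses the introspection function threadpool_info to report what each detected pool is set to. The helper name `per_api_thread_counts` is hypothetical, and the output depends entirely on which native libraries are loaded in your process, so treat it as an illustration rather than a reference.

```python
def per_api_thread_counts(limits):
    """Apply per-API limits and report the thread count of each detected pool.

    `limits` maps a user API name such as "blas" or "openmp" to a maximum
    number of threads. Returns a list of
    (user_api, internal_api, num_threads) tuples, one per detected library.
    """
    # Imported lazily so that merely defining this helper does not require
    # threadpoolctl to be installed.
    from threadpoolctl import threadpool_info, threadpool_limits

    with threadpool_limits(limits=limits):
        return [(pool["user_api"], pool["internal_api"], pool["num_threads"])
                for pool in threadpool_info()]


# Intended use (import numpy first so that a BLAS library is loaded):
#     import numpy as np
#     print(per_api_thread_counts({"blas": 1, "openmp": 2}))
```

Only libraries already loaded into the process are detected, so an empty list simply means no supported thread pool was found.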

Note: this package is under development and any feedback is appreciated.


Source: https://habr.com/ru/post/984943/

