Why does the TreeBagger in Matlab 2014a / b use only a few workers from the parallel pool?

I use the TreeBagger class provided by Matlab (R2014a and R2014b) together with the distributed computing toolbox. I have a local cluster running 30 workers on a Windows 7 machine with 40 cores.

I call the TreeBagger constructor to create a regression forest (an ensemble of 32 trees), passing an options structure with 'UseParallel' set to 'always'.
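For reference, a call of the kind described above might look like the following minimal sketch. The predictor matrix X and response vector Y are made-up placeholder data, not the asker's; the 'always' value is taken from the question, and later Matlab releases may expect a logical true there instead.

```matlab
% Hypothetical training data: 1000 observations, 10 predictors.
X = rand(1000, 10);
Y = sum(X, 2) + 0.1 * randn(1000, 1);

% Options structure asking the Statistics Toolbox to use the parallel pool.
% 'always' matches the question; newer releases document true/false.
opts = statset('UseParallel', 'always');

% Regression forest with 32 trees, as in the question.
forest = TreeBagger(32, X, Y, 'Method', 'regression', 'Options', opts);
```

If no pool is open when this runs, training may simply proceed serially, so it is worth confirming that the pool exists before timing anything.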

However, TreeBagger appears to use only 8 of the 30 available workers (judging by the per-process CPU usage shown in Task Manager). When I test the pool with a simple parfor loop:

    parfor i = 1:30
        a = fft(rand(20000));
    end

Then all 30 workers are involved.

My question is: (how) can I get TreeBagger to use all available resources?

1 answer

Based on the documentation for the TreeBagger class, the operations involved appear to be quite memory-intensive. Without knowing more about the internal scheduler Matlab uses, it seems likely that the scheduler judges that spreading the workload over fewer workers, each with more memory, is the most efficient way to solve the problem.

The number of workers used or available may also depend on the number of physical cores in the system (as opposed to the number of logical cores exposed by hyperthreading), as well as on the resources Matlab is allowed to consume.
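One way to see what Matlab itself reports, rather than inferring it from Task Manager, is to query the open pool and the cluster profile. This is only a diagnostic sketch: it assumes the default 'local' profile is the one in use, and it shows how many workers are available, not how many TreeBagger actually keeps busy.

```matlab
% Query the currently open pool without starting a new one.
pool = gcp('nocreate');
if isempty(pool)
    fprintf('No parallel pool is open.\n');
else
    fprintf('Open pool has %d workers.\n', pool.NumWorkers);
end

% Upper limit configured for the (assumed) local cluster profile.
c = parcluster('local');
fprintf('Local profile allows up to %d workers.\n', c.NumWorkers);
```

If the pool reports fewer workers than expected, the limit lies in the cluster profile rather than in TreeBagger.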

Spreading memory-intensive tasks across fewer than the maximum number of workers is a common technique in HPC for some types of problems.


Source: https://habr.com/ru/post/1208360/

